Abstract


This report documents research efforts in quality assessment, recruitment,
training, and the development of application extensions for the George
Mason University (GMU) Geocrowdsourcing Testbed. The GMU
Geocrowdsourcing Testbed is designed to capture, evaluate, and utilize
crowdsourced geospatial data associated with transient obstacles and
navigation hazards in the region surrounding the GMU campus in Fairfax,
Virginia. We present our quality assessment research, which is based on
best practices, and discuss its deployment within our system. We present
our training and recruitment program and discuss future directions for
recruiting and training participants. Finally, we present extensions of
our geocrowdsourcing testbed in the areas of accessible routing and
visualization, which are ongoing focus areas for our research. The results
of this research have military application for hazard identification and
reporting in similarly built environments, as well as for navigation by
disabled soldiers and veterans.
1 Introduction and Background


In August 2005, Hurricane Katrina struck the Louisiana and Mississippi
coastlines, causing an estimated 108 billion dollars in damage and more
than 1,800 fatalities. It was the costliest tropical storm and one of the
deadliest tropical storms ever to hit the United States.1 FEMA, NOAA, and
several other federal agencies used remote sensing, geographic information
systems (GIS), and other geospatial tools to provide forecasts, predict
storm surges, map inundation, and afterward, assess the massive
damage caused by the storm. FEMA’s inundation map of the Mississippi
Gulf Coast (excerpt, Figure 1), produced with GIS using detailed elevation
data, underscores the usefulness of GIS in predicting and assessing the
dynamics of tropical storms. The post-Katrina reconstruction along the
Gulf Coast has been guided by these FEMA “Katrina Recovery Maps,”
produced with the help of GIS.


Figure  1.  Excerpt,  FEMA  August  2005  inundation  map  of  Mississippi  coast.2 


1  Richard  D.  Knabb,  Jamie  R.  Rhome,  and  Daniel  P.  Brown,  Tropical  Cyclone  Report:  Hurricane  Katrina, 
23-30  August  2005  (National  Hurricane  Center,  2005). 


2  http://www.fema.gov/pdf/hazard/flood/recoverydata/ms  overview.pdf  [accessed  September  4, 
2014] 
In January 2010, another natural disaster struck when a catastrophic
earthquake hit Port-au-Prince, Haiti, causing an estimated 15 billion
dollars in damage and killing more than 150,000 people. Without the benefit
of elaborate predictive models, high-resolution elevation data, and a
significant government GIS capability, the relief efforts and prospects in
Haiti looked much bleaker than those in New Orleans less than five years
earlier.

The period between Hurricane Katrina in 2005 and the Haitian earthquake in
2010 saw several significant technological changes and developments, most
notably the increase in the public’s awareness and use of social media.
Public engagement in the production of information and the sharing of
that information on the Internet was an important cultural development
over this period. Howe (2006, 2008) characterizes one significant form of
public engagement as crowdsourcing, where “a task traditionally performed
by a designated agent is outsourced by making an open call to an undefined
but large group of people.”3,4,5 The permeation of Howe’s crowdsourcing
concept into the geospatial community between 2006 and 2010 resulted in a
very fortunate confluence of people, technology, and social movements,
described best by Zook et al. (2010), where crowdsourcing, GIS, citizen-led
open source mapping, non-profit organizations, and government agencies
collaborated in a large volunteered mapping and data generation effort to
assist disaster relief efforts in Haiti immediately following the
earthquake. Zook et al. describe this effort as “a remarkable example of
the power of crowdsourced online mapping and the potential for new avenues
of interaction between physically distant places.”6 The geocrowdsourcing
efforts described by Zook et al. may turn out to be historical hallmark
events in the evolution of GIS toward an end-user-centered, open system.


3 Jeff Howe, “The Rise of Crowdsourcing,” Wired Magazine 14, no. 6 (2006): 1-4.

4 Jeff Howe, Crowdsourcing: Why the Power of the Crowd Is Driving the Future of Business (New York:
Crown Business, 2008).

5  http://www.bizbriefings.com/Samples/lntlnst%20— %20Crowdsourcing.PDF  [accessed  Sep.  4,  2014] 

6 Matthew Zook et al., “Volunteered Geographic Information and Crowdsourcing Disaster Relief: A Case
Study of the Haitian Earthquake,” World Medical & Health Policy 2, no. 2 (July 21, 2010): 6-32,
doi:10.2202/1948-4682.1069, p. 7.
Crowdsourced Geospatial Data

One of the most important and strategic contemporary trends in the
geospatial sciences, underscored by the Haitian earthquake response, is the
use of map-based crowdsourcing for collecting, confirming, editing, and
displaying geospatial data. Goodchild (2007, 2009) and many other recent
authors cite several significant benefits associated with this general
approach: the local geographic expertise of the contributors, who are
more familiar with the local features being mapped; the speed with which
information can be collected and mapped; and finally, the greatly reduced
costs associated with what is typically a very expensive activity.7,8

In the military and intelligence communities, the field-based collection of
time-relevant geographic information is a critical aspect of supporting
operations, particularly in urban environments, where people, places,
activities, events, and other items of interest change very quickly. There
is often no practical way to capture data about the location and nature of
these rapidly unfolding geographic events using traditional geospatial data
collection methods. In many settings and circumstances, traditional data
collection methods work well, but under other circumstances,
geocrowdsourcing may offer a distinct advantage.

In their 2012 technical report, Rice et al. provide a comprehensive
overview of the emerging phenomena of crowdsourced geospatial data and the
advantages associated with this data production paradigm. They compare
and contrast geocrowdsourcing techniques with traditional geospatial data
production activities, discuss quality assessment methods, and review
several emerging geocrowdsourcing applications.9 As concluded in the final
chapter of their report, crowdsourced geospatial data presents many
strategic advantages and significant challenges, and can be characterized
as an important additional tool within the complete toolkit available to
the geospatial community. The methods used and benefits obtained by
prominent geocrowdsourcing practitioners in other domains (emergency
management, humanitarian response, natural resource protection,
transportation, and accessibility) can offer insight to practitioners in
the military or intelligence domains, where geocrowdsourcing techniques can
offer benefits, but should be considered carefully. Many of these
compelling application domains and resulting lessons learned have been
characterized and discussed by Rice et al. (2011, 2012a, 2012b,
2013),10,11,12,13 Zook et al. (2010),14 Sui et al. (2013),15 and Liu and
Palen (2010).16 Where relevant and useful, conclusions and insights from
these works will be presented in this report.


7 Michael F. Goodchild, “Citizens as Sensors: The World of Volunteered Geography,” GeoJournal 69, no.
4 (December 2007): 211-21.

8 Michael F. Goodchild, “NeoGeography and the Nature of Geographic Expertise,” Journal of Location
Based Services 3, no. 2 (June 2009): 82-96, doi:10.1080/17489720902950374.

9 Matthew T. Rice et al., Crowdsourced Geospatial Data: A Report on the Emerging Phenomena of
Crowdsourced and User-Generated Geospatial Data, Annual (Fairfax, VA: George Mason University,
November 29, 2012), http://www.dtic.mil/dtic/tr/fulltext/u2/a576607.pdf.

Crowdsourcing  Transient  Navigation  Obstacles 

To extend previous research work (Rice et al. 2005, Golledge et al. 2005,
Golledge et al. 2006),17,18,19 and to provide a useful application of
geocrowdsourcing, Rice et al. (2013) presented the conceptual design of a
system for collecting transient obstacle information to help blind,
visually-impaired, and mobility-impaired individuals navigate through
unfamiliar environments.


10 Rice et al., “Integrating User-Contributed Geospatial Data with Assistive Geotechnology Using a
Localized Gazetteer,” in Advances in Cartography and GIScience. Volume 1, ed. Anne Ruas, Lecture Notes
in Geoinformation and Cartography (Springer Berlin Heidelberg, 2011), 279-91,
http://dx.doi.org/10.1007/978-3-642-19143-5_16.

11 Rice et al., Crowdsourced Geospatial Data: A Report on the Emerging Phenomena of Crowdsourced
and User-Generated Geospatial Data.

12 Matthew T. Rice et al., “Supporting Accessibility for Blind and Vision-Impaired People With a Localized
Gazetteer and Open Source Geotechnology,” Transactions in GIS 16, no. 2 (April 2012): 177-90,
doi:10.1111/j.1467-9671.2012.01318.x.

13 Matthew T. Rice et al., Crowdsourcing to Support Navigation for the Disabled: A Report on the
Motivations, Design, Creation and Assessment of a Testbed Environment for Accessibility, US Army Corps of
Engineers, Engineer Research and Development Center, US Army Topographic Engineering Center
Technical Report, Data Level Enterprise Tools Workgroup (Fairfax, VA: George Mason University,
September 2013),
http://oai.dtic.mil/oai/oai?verb=getRecord&metadataPrefix=html&identifier=ADA588474.

14 Zook et al., “Volunteered Geographic Information and Crowdsourcing Disaster Relief.”

15 Daniel Sui, Sarah Elwood, and Michael F. Goodchild, eds., Crowdsourcing Geographic Knowledge:
Volunteered Geographic Information (VGI) in Theory and Practice (New York, NY: Springer, 2013).

16 S. B. Liu and L. Palen, “The New Cartographers: Crisis Map Mashups and the Emergence of
Neogeographic Practice,” Cartography and Geographic Information Science 37, no. 1 (2010): 69-90.

17 Matt Rice et al., “Design Considerations for Haptic and Auditory Map Interfaces,” Cartography and
Geographic Information Science 32, no. 4 (2005): 381-91.

18 Reginald G. Golledge, Matthew Rice, and Daniel Jacobson, “A Commentary on the Use of Touch for
Accessing On-Screen Spatial Representations: The Process of Experiencing Haptic Maps and
Graphics,” The Professional Geographer 57, no. 3 (August 2005): 339-49,
doi:10.1111/j.0033-0124.2005.00482.x.

19 Reginald G. Golledge, Matthew T. Rice, and R. Daniel Jacobson, “Multimodal Interfaces for
Representing and Accessing Geospatial Information,” in Frontiers of Geographic Information Technology
(Springer, 2006), 181-208.
Transient obstacles (Figure 2) are a difficult navigation challenge because
they are usually unplanned, unmapped, unpredictable, and temporary.
Navigation systems and geoassistive technology, such as the UCSB
personal guidance system (Loomis et al. 2005, Figure 3),20 offer support to
the blind, visually-impaired, and mobility-impaired community, but lack
the ability to incorporate real-time event and obstacle information. Several
authors, notably Nuernberger (2008), Barbeau et al. (2010), Harrington et
al. (2013), and Matuska (2014), have used communication devices and
modeling techniques to increase the amount of information about
surroundings and unplanned, transient events for blind, visually-impaired,
and mobility-impaired travelers.21,22,23,24


Figure  2.  Transient  Navigation  Obstacle 


20 Jack M. Loomis et al., “Personal Guidance System for People with Visual Impairment: A Comparison of
Spatial Displays for Route Guidance,” Journal of Visual Impairment & Blindness 99, no. 4 (2005): 219.

21 Andrea Nuernberger, “Presenting Accessibility to Mobility-Impaired Travelers” (UCTC Dissertation,
University of California Transportation Center, 2008).

22 Sean J. Barbeau et al., “Travel Assistance Device: Utilising Global Positioning System-Enabled Mobile
Phones to Aid Transit Riders with Special Needs,” Intelligent Transport Systems, IET 4, no. 1 (2010):
12-23.

23 Naomi Harrington et al., “Beyond User Interfaces in Mobile Accessibility: Not Just Skin Deep,” in
Communications, Computers and Signal Processing (PACRIM), 2013 IEEE Pacific Rim Conference on
(IEEE, 2013), 322-29.

24 Jaroslav Matuska, “Railway System Accessibility Evaluation for Wheelchair Users: Case Study in the
Czech Republic,” Transport, no. ahead-of-print (2014): 1-12.
Figure 3. UCSB Personal Guidance System, circa 2003


Crowdsourced  Geospatial  Data  and  Accessibility 

Previous technical reports prepared for the U.S. Army Corps of Engineers,
Engineer Research and Development Center, available from the Defense
Technical Information Center, addressed underlying fundamental issues.
The first report (Rice et al. 2012a)25 addressed the emerging trend of
crowdsourced geospatial data, with a comprehensive discussion of changing
geospatial production paradigms, a review of geocrowdsourcing
applications, a discussion of quality assessment methods adapted from
traditional approaches, a review of evaluation methods and considerations
for crowdsourced geospatial data, and a synopsis of significant trends and
lessons learned. The second report (Rice et al. 2013)26 reviewed the domain
of geocrowdsourcing for accessibility and introduced the GMU
Geocrowdsourcing Testbed prototype, developed to crowdsource and display
transient obstacles and navigation hazards. The report also updated the
summary of emerging trends in geocrowdsourcing.


25  Rice  et  al.,  Crowdsourced  Geospatial  Data:  A  Report  on  the  Emerging  Phenomena  of  Crowdsourced 
and  User-Generated  Geospatial  Data. 

26  Rice  et  al.,  Crowdsourcing  to  Support  Navigation  for  the  Disabled:  A  Report  on  the  Motivations,  Design, 
Creation  and  Assessment  of  a  Testbed  Environment  for  Accessibility. 
This report builds on both previous reports, and presents a body of
research work associated with training and recruitment in
geocrowdsourcing, quality assessment of geocrowdsourced data, and the
experimental use of the GMU Geocrowdsourcing Testbed environment for
accessible routing and data visualization. The second chapter of this
report discusses the development of quality assessment protocols and
functional quality assessment moderation within our geocrowdsourcing
testbed. The third chapter addresses training activities, recruitment
activities, findings, and conclusions. The fourth chapter of this report
addresses experimental efforts to create accessible routing through our
testbed environment, along with preliminary results and conclusions. The
fifth chapter of this report revisits quality assessment and our efforts to
create effective visualization techniques to assess the dynamics and data
quality within our system. Finally, this report ends with a summary of
activity and future plans for our work.
2 Quality Assessment and Moderation in the
GMU Geocrowdsourcing Testbed

Quality assessment is crucial for crowdsourced geographic data (CGD), as
it provides a way of measuring, understanding, and communicating critical
aspects of quality, and therefore provides information to decision makers
and end-users. A determination of the value and quality of information
is critical to understanding whether it can be used appropriately for a
given purpose.

Although the meanings of the terms “value” and “quality” are often
subjective and based on circumstance and context, generally accepted
notions of quality in the geospatial domain include determinations about
the positional, temporal, and attribute accuracy of the information, the
completeness and coverage of the data, and its sufficiency for any
particular application. Guptill and Morrison (1995),27 Veregin (1999),28
and others have refined what we now consider to be the most important
elements of spatial data quality: positional accuracy, attribute accuracy,
completeness, logical consistency, semantic accuracy, temporal accuracy,
and lineage. These quality assessment items and others relevant to
crowdsourced geospatial data are reviewed by Rice et al. (2012a)29 and
Rice et al. (2013),30 and are articulated by Girres and Touya (2010).31
These items will not be addressed individually in exhaustive form, having
been covered in earlier reports, but will be discussed in this chapter as
they pertain to the quality assessment work in the GMU Geocrowdsourcing
Testbed environment, introduced in Rice et al. (2013)32 and extended
during this recent research phase.


27  Stephen  C.  Guptill,  Joel  L.  Morrison,  and  International  Cartographic  Association,  Elements  of  Spatial 
Data  Quality,  vol.  202  (Elsevier  Science  Oxford,  1995). 

28  Howard  Veregin,  “Data  Quality  Parameters,”  Geographical  Information  Systems  1  (1999):  177-89. 

29  Rice  et  al.,  Crowdsourced  Geospatial  Data:  A  Report  on  the  Emerging  Phenomena  of  Crowdsourced 
and  User-Generated  Geospatial  Data. 

30 Rice et al., Crowdsourcing to Support Navigation for the Disabled: A Report on the Motivations,
Design, Creation and Assessment of a Testbed Environment for Accessibility.

31 Jean-François Girres and Guillaume Touya, “Quality Assessment of the French OpenStreetMap
Dataset,” Transactions in GIS 14, no. 4 (August 2010): 435-59,
doi:10.1111/j.1467-9671.2010.01203.x, pp. 439-440.

32 Rice et al., Crowdsourcing to Support Navigation for the Disabled: A Report on the Motivations,
Design, Creation and Assessment of a Testbed Environment for Accessibility.
The Nature of Error in Geospatial Data

Hunter and Beard (1992) address quality in geospatial data by articulating
the relationship between sources of error, forms of error, and resulting
errors that exist in geospatial data (Figure 4). They provide a useful
perspective on quality, noting that error may be inherent in the
information acquired for a project or it may be separately introduced by
the actions of the user in processing, managing, or analyzing the data in a
geographic information system (GIS) (1992, 108).33


Figure 4. Hunter and Beard 1992, Classification of Error in GIS, from
“Understanding Error in Spatial Databases”

Based on a concept pioneered by landscape architect Ian McHarg and
articulated in Design with Nature (1969),34 GIS uses a map overlay
technique where several thematic layers are combined to create a composite
layer that contains elements of all the inputs. The map overlay is then used
to address geographic problems. Figure 5, from Hill (2006),35 shows a
typical combination of thematic layers, each of which has its own unique
characteristics.


33  Gary  J.  Hunter  and  Kate  Beard,  “Understanding  Error  in  Spatial  Databases,”  Australian  Surveyor  37, 
no.  2  (1992):  108-19. 

34  Ian  L.  McHarg  and  Lewis  Mumford,  Design  with  Nature  (American  Museum  of  Natural  History  New 
York,  1969). 

35 Linda L. Hill, Georeferencing: The Geographic Associations of Information, Digital Libraries and
Electronic Publishing (Cambridge, Mass: MIT Press, 2006).
Figure 5. Thematic Layers in a GIS used for Map Overlay, from Hill (2006).
Used with permission (Cartomedia.com).

Goodchild and Gopal (1989)36 suggest that the cumulative effect of
positional errors in various thematic layers during a GIS overlay (Figure 5)
is difficult to ascertain and may require multiple models for error.
Assessing quality in geospatial data can be complex and difficult. A
comprehensive review of quality assessment concepts for crowdsourced
geospatial data is contained in Chapter 4 of Rice et al. (2012a)37 and
Chapter 3 of Rice et al. (2013),38 and practical approaches relevant to CGD
are addressed in the same works.

The following sections of this chapter will discuss general quality
assessment research and accepted practices, including references from key
publications. Following this discussion, there will be an explanation of how
these quality assessment concepts are implemented and measured in our
system.

36 Michael F. Goodchild and Sucharita Gopal, The Accuracy of Spatial Databases (London; New York:
Taylor & Francis, 1989).

37 Rice et al., Crowdsourced Geospatial Data: A Report on the Emerging Phenomena of Crowdsourced
and User-Generated Geospatial Data.

38 Rice et al., Crowdsourcing to Support Navigation for the Disabled: A Report on the Motivations, Design,
Creation and Assessment of a Testbed Environment for Accessibility.


Quality Assessment: Positional Accuracy

For geospatial data produced by U.S. Federal agencies, standards for
quality assessment have been developed and are widely used. Similar
standards and approaches are used in industry. The National Map Accuracy
Standards (NMAS), developed in the early 1940s and published in 1947,
are applicable to printed and fixed-scale maps. They specify that 90% of
tested, easily identified features should have positional errors no greater
than 1/30 of an inch at map scale for maps produced at a scale larger (more
detailed) than 1:20,000, and no greater than 1/50 of an inch for maps
produced at a scale of 1:20,000 or smaller (less detailed).39 A more
relevant contemporary approach for assessing accuracy is the National
Standard for Spatial Data Accuracy (NSSDA), which uses a statistical
methodology for estimating the positional accuracy of maps and geospatial
data.40 There is no single threshold value, as in the NMAS, but federal
agencies that produce, collect, or use geospatial data are encouraged to
set their own standards for acceptable accuracies and to report accuracies
using the methodology outlined in the NSSDA. The NSSDA uses the
root-mean-square error (RMSE), which is the square root of the average of
the squared deviations of sampled points from a source of ground truth. The
results of an NSSDA-based positional accuracy assessment are reported at
the 95% confidence level, which implies that no more than 5% of
observations will have a positional error greater than the reported
accuracy value. The NSSDA acknowledges that geospatial datasets typically
have multiple layers, each with its own characteristics and, possibly,
differing accuracies. For complex, composite datasets with multiple input
layers, the NSSDA suggests:

1. If data of varying accuracies can be identified separately in a
dataset, compute and report separate accuracy values.

2. If data of varying accuracies are composited and cannot be
separately identified AND the dataset is tested, report the accuracy value
for the composited data.

3. If a composited dataset is not tested, report the accuracy value for
the least accurate dataset component.41

39 “United States National Map Accuracy Standards” (U.S. Bureau of the Budget, 1947),
http://nationalmap.gov/standards/pdf/NMAS647.PDF; “National Geospatial Data Standards - United
States National Map Accuracy Standards,” USGS, October 28, 2011,
http://nationalmap.gov/standards/nmas.html; Paul A. Longley et al., Geographic Information Systems
and Science, 3rd edition (Hoboken, New Jersey: John Wiley & Sons, 2011), §6.3.3, p. 164.

40 U.S. Geological Survey, “Geospatial Positioning Accuracy Standards, Part 3: National Standard for
Spatial Data Accuracy,” Federal Geographic Data Committee, August 19, 2008.

41 Rice et al., Crowdsourced Geospatial Data: A Report on the Emerging Phenomena of Crowdsourced
and User-Generated Geospatial Data, Chapter 4, p. 67.


In the GMU Geocrowdsourcing Testbed, positional accuracy is assessed
for each individual report contributed to our system, and in this way, our
approach is most similar to NSSDA condition one, where separate accuracy
values can be obtained and reported.
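
The NMAS tolerances and the NSSDA statistic discussed in this section can be illustrated with a short sketch. It is not code from the testbed. The function names are hypothetical, the NMAS rule follows the published wording (1/30 inch for scales larger than 1:20,000, 1/50 inch otherwise), and the 1.7308 multiplier is the NSSDA horizontal factor, which assumes that the RMSE values in x and y are approximately equal.

```python
import math

INCH_TO_M = 0.0254  # exact by definition


def nmas_ground_tolerance_m(scale_denominator):
    """Ground distance (meters) matching the NMAS horizontal tolerance.

    Hypothetical helper: 1/30 inch at map scale for scales larger than
    1:20,000, and 1/50 inch for scales of 1:20,000 or smaller.
    """
    fraction_in = 1 / 30 if scale_denominator < 20_000 else 1 / 50
    return fraction_in * INCH_TO_M * scale_denominator


def nssda_horizontal_accuracy(dx_m, dy_m):
    """Return (RMSE_r, NSSDA accuracy at the 95% confidence level).

    dx_m, dy_m: per-point easting and northing differences (meters)
    between tested positions and ground truth. The 1.7308 factor assumes
    RMSE_x and RMSE_y are approximately equal.
    """
    n = len(dx_m)
    if n == 0 or n != len(dy_m):
        raise ValueError("dx_m and dy_m must be non-empty and equal length")
    squared = sum(dx * dx + dy * dy for dx, dy in zip(dx_m, dy_m))
    rmse_r = math.sqrt(squared / n)
    return rmse_r, 1.7308 * rmse_r
```

For a 1:24,000 map, for example, the NMAS tolerance of 1/50 inch corresponds to about 12.2 meters on the ground.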

Positional accuracy studies have been conducted for crowdsourced
geospatial data, primarily by comparing OpenStreetMap (OSM) data to a source
of known, higher accuracy. The results of these studies are summarized in
Ruitton-Allinieu (2011).42 Haklay’s 2010 study of OSM data in the United
Kingdom43 demonstrated that the positional accuracy of OSM roads data,
when compared to authoritative Ordnance Survey data, was within six
meters. Girres and Touya (2010)44 performed a quality assessment of the
French OSM datasets with similar findings. They addressed a comprehensive
set of quality measures, including positional (geometric) accuracy,
attribute accuracy, completeness, logical consistency, semantic accuracy,
temporal accuracy, lineage, and usage. With regard to positioning of
features in their sample, they determined that the Euclidean distance
between matching intersection points in the road networks averaged 6.65
meters, with a maximum of 31.58 meters and a minimum of 0.68 meters.

In the GMU Geocrowdsourcing Testbed (Rice et al. 2013), positional
accuracy assessments (and all other quality assessments) are done by
moderators for the individual reports contributed to our system, which are
then used to create obstacles. These report-level quality assessment
statistics are inherited by the obstacles during the obstacle creation
process, and are a direct reflection of the quality of the source
report(s). In general, all the quality statistics and quality assessment
practices for reports also apply to the obstacles generated from the reports.
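
The inheritance of report-level quality statistics by obstacles can be pictured with a minimal data-structure sketch. The class and field names below are hypothetical and do not come from the testbed. The aggregation rule for an obstacle built from several reports is also an assumption: the sketch takes the largest moderator-assessed error, in the spirit of the NSSDA suggestion to report the least accurate component.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Report:
    """A contributed report; the error is filled in by a moderator."""
    report_id: int
    positional_error_m: Optional[float] = None


@dataclass
class Obstacle:
    """An obstacle that carries quality statistics from its source reports."""
    source_report_ids: List[int]
    positional_error_m: float


def create_obstacle(reports: List[Report]) -> Obstacle:
    """Build an obstacle from moderated reports (assumed worst-case rule)."""
    assessed = [r for r in reports if r.positional_error_m is not None]
    if not assessed:
        raise ValueError("at least one moderated report is required")
    return Obstacle(
        source_report_ids=[r.report_id for r in assessed],
        positional_error_m=max(r.positional_error_m for r in assessed),
    )
```

With a single source report, the obstacle simply carries that report's assessed error, which matches the direct-reflection behavior described above.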

The  positional  accuracy  characteristics  of  the  reports  in  our  GMU  Geo¬ 
crowdsourcing  Testbed  are  determined  from  a  comparison  of  the  contribu¬ 
tor’s  position  estimate,  derived  from  the  positioning  of  an  icon  on  the  map, 
and  the  moderator’s  field-checked  position  for  the  report.  The  difference 
between  these  positions  is  calculated  with  spherical  formulas  and  convert¬ 
ed  to  meters.  The  median  positional  accuracy  for  our  reports  is  2.236  me¬ 
ters,  and  the  average  positional  accuracy  is  18.36  meters,  with  a  standard 


42  Anne-Marthe  Ruitton-Allinieu,  “Crowdsourcing  of  Geoinformation:  Data  Quality  and  Possible  Appli¬ 
cations”  (Master  of  Science,  Aalto  University,  2011), 
http://maa.aalto.fi/fi/geoinformatiikan_tutkimusryhma- 
gma/geoinformatiikka_ja_kartografia/201  l_ruitton-allinieu_a.pdf. 

43  M.  Haklay,  “How  Good  Is  Volunteered  Geographical  Information?  A  Comparative  Study  of  Open¬ 
StreetMap  and  Ordnance  Survey  Datasets,”  2008. 

44  Girres  and  Touya,  "Quality  Assessment  of  the  French  OpenStreetMap  Dataset.” 
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deviation  of  75.36  meters.  The  minimum  positional  error  in  our  testbed  is 
o,  while  the  maximum  positional  error  is  447.76  meters.  This  large  aver¬ 
age  positional  error  for  reports  (18.36  meters)  is  strongly  influenced  by 
two  reports  with  unusually  high  positional  errors  (322.86  meters  and 
447.76  meters).  In  these  two  cases,  the  report  contributor  failed  to  re¬ 
position  the  blue  location  icon  (close  to  the  centers  of  Figure  6,  Figure  7, 
and  Figure  8)  from  its  default  location,  resulting  in  large  positional  errors. 
Without  these  two  reports  included,  the  average  positional  error  of  reports 
in  our  system  is  4.59  meters  and  median  positional  error  is  1.86  meters.  To 
avoid  errors  of  this  type  in  the  future,  we  changed  the  default  behavior  of 
our  contribution  system  and  now  require  contributors  to  reposition  the 
blue  location  icon  before  reports  can  be  submitted.  The  recently  updated 
mobile  report  contribution  tool  uses  the  device  GPS  for  report  positioning 
and  should  eliminate  positional  errors  due  to  incorrect  positioning  of  the 
location  icon. 
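
The effect of a few large errors on the mean, compared with the median, can be illustrated with a short sketch. The two large values are the outliers quoted above; the six small values are invented for illustration and are not the testbed's report data, and the function name and cutoff parameter are assumptions.

```python
from statistics import mean, median

# Positional errors in meters. Only the last two values come from the text
# (the two reports left at the default icon location); the rest are invented.
errors_m = [0.0, 1.2, 1.9, 2.3, 3.1, 4.8, 322.86, 447.76]


def summarize(errors, max_error_m=None):
    """Return count, mean, and median, optionally dropping large outliers."""
    kept = [e for e in errors if max_error_m is None or e <= max_error_m]
    return {"n": len(kept), "mean": mean(kept), "median": median(kept)}


with_outliers = summarize(errors_m)
without_outliers = summarize(errors_m, max_error_m=100.0)
print(with_outliers)
print(without_outliers)
```

In this invented sample the mean changes by more than an order of magnitude when the two outliers are dropped, while the median barely moves, which is the same pattern reported for the testbed.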

Quality  Assessment:  Temporal  Accuracy 

Zook et al. (2010),45 as well as Goodchild and Glennon (2010),46 review the
use of crowdsourced geospatial data during natural disasters, where the
primary focus is on rapid data collection. Goodchild and Glennon’s discussion
of the community mapping efforts during the California wildfires and
Zook et al.’s discussion of similarly rapid mapping efforts during the
Haitian earthquake contrast with the much longer production processes for
authoritative data. These two paradigms are compared in Chapter 2 of
Rice et al. (2012a).47 Zook et al. (2010), in particular, note the value of
combined or hybrid uses of CGD and authoritative data during the Haitian
earthquake.

Many  of  the  devices  used  for  crowdsourced  geospatial  data  capture 
(smartphones,  tablets,  GPS,  cameras,  etc.)  have  the  ability  to  capture  time, 
and  an  acquisition  time-date  stamp  is  often  embedded  within  the  data. 
Temporal  quality  in  geospatial  data  is  related  to  the  accuracy  of  time 
measurements  contained  in  the  data,  and  importantly  (from  the  perspec¬ 
tive  of  CGD),  the  update  frequency  for  the  dataset.  Update  frequency  is 


45  Zook  et  al.,  “Volunteered  Geographic  Information  and  Crowdsourcing  Disaster  Relief.” 

46 Michael F. Goodchild and J. Alan Glennon, “Crowdsourcing Geographic Information for Disaster Response:
A Research Frontier,” International Journal of Digital Earth 3, no. 3 (September 2010): 231-41,
doi:10.1080/17538941003759255.

47  Rice  et  al.,  Crowdsourced  Geospatial  Data:  A  Report  on  the  Emerging  Phenomena  of  Crowdsourced 
and  User-Generated  Geospatial  Data.  P.  7-18 
important for CGD, due to the speed with which CGD can be collected. In
past decades, authoritative geospatial data production cycles could take
years and typically ended with a paper map printed on a specific date. The
production cycles for CGD are more continuous in nature, characterized by
frequent updates and immediate availability over computer networks.
During a three-month period in 2009, Girres et al.48 noted a 31.7% increase
in OSM features, representing 260,000 objects. For France, they noted a
positive linear relationship between the number of contributors present,
the number of objects in OSM, and the frequency of updates, validating
the Linus’ Law49 concept for CGD noted in a separate publication by
Haklay et al. (2010).50

For  our  approach  to  temporal  accuracy,  we  are  interested  not  just  in  the 
accuracy  of  individual  time  measurements  associated  with  observation 
and  report  submission  times,  but  also  in  the  elapsed  time  between  the 
start  and  end  of  an  obstacle  or  event  “lifespan”,  which  is  a  more  significant 
aspect  for  the  transient  obstacles  and  events.  Future  efforts  will  focus  on 
identifying  the  precision  for  estimates  of  start  and  stop  times  of  transient 
obstacles. 

Quality  Assessment:  Attribute  Accuracy 

Attributes,  in  a  geospatial  sense,  are  the  non-spatial  data  linked  to  a  loca¬ 
tion.  Attributes  describe  the  characteristics  of  a  geospatial  feature  and  can 
include  anything  from  measureable  characteristics,  like  length  and  width, 
to  descriptive  characteristics,  like  ownership  or  land  cover.  According  to 
Girres et al. (2010), attribute accuracy “assesses the accuracy of quantitative
attributes, the correctness of non-quantitative attributes and the
classification of features.”51

Our  GMU  Geocrowdsourcing  Testbed  does  not  ask  contributors  for  direct 
measurements  or  assessments  of  the  quantitative  attributes  of  an  obstacle, 
but  it  does  request  that  users  provide  estimates  of  duration  and  urgency, 


48  Girres  and  Touya,  “Quality  Assessment  of  the  French  OpenStreetMap  Dataset." 

49  Eric  S.  Raymond,  "Release  Early,  Release  Often,”  The  Cathedral  and  the  Bazaar,  08/02, 
http://www.catb.org/esr/writings/homesteading/cathedral-bazaar/ar01s04.html. 

50 Mordechai (Muki) Haklay et al., “How Many Volunteers Does It Take to Map an Area Well? The Validity
of Linus' Law to Volunteered Geographic Information,” The Cartographic Journal 47, no. 4 (November
1, 2010): 315-22, doi:10.1179/000870410X12911304958827.

51  Girres  and  Touya,  “Quality  Assessment  of  the  French  OpenStreetMap  Dataset.”  p.440. 
both of which involve ordinal category selections, and a categorical selection
of an obstacle type, which is a descriptive characteristic.

Feature naming in geospatial datasets is a difficult area for quality assessment,
due to the lack of universally accepted naming conventions. Analyzing
the names assigned to lakes in OSM and comparing them to names recorded
in BD Topo®, produced by the French National Institute of Geographic and
Forestry Information (IGN), Girres et al. found that 55% of the lake names
matched. A main finding in the Girres et al. study, reflecting the concept of
Linus’ Law, is that the more contributors there are for a given area, the
better the quantitative attribute accuracy. They suggest a linear relationship
between the number of quantitative tags recorded for a given area
and the number of contributors. Haklay et al. (2010)52 suggest a similar
dynamic with regard to positional accuracy of features in OSM.

Errors  due  to  misclassification  and  incorrect  attribute  values  are  common 
in CGD. If an attribute specification is available, this problem may be due
to  the  contributor’s  inability  to  correctly  assign  the  appropriate  attribute. 
In  some  cases,  assignment  of  an  appropriate  attribute  value  may  be  sub¬ 
ject  to  interpretation,  where  even  experts  might  disagree.  In  other  cases, 
attribute  accuracy  problems  may  be  due  to  a  lack  of  expertise  on  the  part 
of  the  contributor,  who  may  lack  the  technical  background  and  experience 
required  to  understand  and  assign  an  appropriate  value. 

Other Quality Assessment Considerations

Completeness

As  noted  in  Girres  et  al.  (2010), 53  completeness  measures  the  absence  of 
features  (omissions)  in  a  dataset,  and  the  existence  of  superfluous  features 
(commissions)  in  a  dataset.  Completeness  is  often  discussed  in  the  context 
of  a  dataset’s  specification,  which  is  the  selection  criteria  and  expected  lev¬ 
el  of  detail  at  a  specific  scale.  CGD  projects  often  lack  a  specification  at 
their  outset,  and  therefore  it  is  difficult  to  determine  completeness.  Cover¬ 
age,  which  describes  a  different  but  related  aspect  of  quality,  assesses  the 
presence and density of features found in an area. Coverage can be assessed
without a specification by comparing a dataset with an authoritative


52 Haklay et al., “How Many Volunteers Does It Take to Map an Area Well?”

53 Girres and Touya, “Quality Assessment of the French OpenStreetMap Dataset.”

source at the same general scale and same level of detail (Haklay 2010,
Girres et al. 2010). Haklay (2010)54 assessed the coverage of roads in OSM
and determined that they had 69% coverage in comparison with the
authoritative datasets. A 2008 study by the same author noted much higher
coverage in affluent areas (76.6%) than in poor areas (46.1%).55

Rice et al. (2013, 38) assessed the initial coverage of the GMU
Geocrowdsourcing Testbed, noting four conspicuous data voids in the West
Campus, Mason Inn, Patriot Center, and Masonvale areas (Figure 6). Our
engagement with the neighboring jurisdictions of Fairfax City and Fairfax
County has resulted in an expanded area of interest and new data voids
(Figure 7), which are being assessed on a weekly basis to increase our coverage.
Data  voids  could  be  due  to  the  lack  of  obstacles  or  lack  of  observations  in 
an  area.  In  our  case,  we  believe  the  voids  are  due  to  lack  of  observations  in 
those  areas. 

Figure 6. Geographic reporting voids (2013)


54  Haklay,  “How  Good  Is  Volunteered  Geographical  Information?" 

55  ibid. 

Figure 7. Geographic reporting voids (2014)

Malicious  and  Mischievous  Content 

Wikipedia  is  the  most  popular  reference  website  in  the  world,  and  one  of 
the  most  targeted,  with  respect  to  malicious  content  and  vandalism.  OSM, 
the  most  widely  used  geocrowdsourcing  resource,  has  similar  problems 
with malicious and mischievous content. Both resources have developed
extensive automated tools to detect unusual patterns and transactions
in an effort to reduce malicious and mischievous content. Although Rice
(2001, 2005)56,57 notes some significant exceptions with regard to
cartographic copyright traps, false content in geospatial data can reduce
the utility of CGD and its perceived quality. Malicious and mischievous
content also discourages large, publicly exposed


56 Matthew T. Rice, “Strategies for Robust Digital Cartographic Steganography,” in Proceedings, The
20th International Cartographic Conference, ICC2001, Beijing, China, August, 2001, 1156-64.

57 Matthew T. Rice, “Intellectual Property Control for Maps and Geographic Data” (Ph.D. Dissertation,
University of California, 2005).

organizations from using CGD or related techniques, due to the legal
liabilities and potential embarrassment. Rice et al. (2012a)58 discuss
this topic in more detail. For this project, the two items of potential
malicious and mischievous content are inappropriate or unauthorized image
content, and profanity, both of which would reflect negatively on the
authoritative partners and project staff. At this point we are not seeing either
of these items in our contributions, but have technical safeguards for
profanity detection, which are discussed later in this chapter.

Logical  Consistency 

Logical  consistency  refers  to  the  use  of  tests  for  validity  of  CGD,  and  in¬ 
cludes  items  such  as  common  digitizing  errors  (undershoots,  overshoots, 
sliver  polygons,  etc.),  topological  errors,  such  as  unconnected  network 
segments  or  segments  that  do  not  properly  intersect,  as  well  as  data  values 
that  are  out  of  range.  Longley  et  al.  (2011,  240)  contains  a  useful  summary 
of  the  common  topological  errors.  OSM  has  developed  some  automated 
tools for identifying topological errors in their data, and researchers
Goodchild and Li (2012) recommend a geographic rules-based approach for
determining the validity of CGD.59 This rules-based approach for addressing
logical  consistency  is  also  discussed  in  Rice  et  al.  (2012a,  73-75).  As  dis¬ 
cussed  later  in  this  chapter,  assessing  logical  consistency  for  the  data  con¬ 
tributed  to  our  system  consists  primarily  of  a  check  for  valid  data  values 
during  the  reporting  process.  The  underlying  geospatial  data  used  for  rout¬ 
ing  in  our  testbed  is  checked  for  logical  errors  with  inspection  of  under¬ 
shoots,  overshoots,  and  overlapping  features  being  the  primary  focus. 
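
A check for valid data values at reporting time can be sketched as follows. This is a minimal illustration and not the testbed's code: the dictionary keys and the function name are hypothetical, and only the category lists are taken from the menus described in this chapter.

```python
# Category lists as named in this report; all field names are hypothetical.
OBSTACLE_TYPES = {
    "sidewalk obstruction", "construction detour", "entrance/exit problem",
    "poor surface condition", "crowd/event", "other",
}
DURATIONS = {"short", "medium", "long"}
URGENCIES = {"low", "medium", "high"}


def validate_report(report):
    """Return a list of problems; an empty list means all values are valid."""
    problems = []
    lat = report.get("lat")
    lon = report.get("lon")
    if not isinstance(lat, (int, float)) or not -90.0 <= lat <= 90.0:
        problems.append("latitude missing or out of range")
    if not isinstance(lon, (int, float)) or not -180.0 <= lon <= 180.0:
        problems.append("longitude missing or out of range")
    types = report.get("obstacle_types") or []
    if not types or any(t not in OBSTACLE_TYPES for t in types):
        problems.append("obstacle type missing or not in the menu")
    if report.get("duration") not in DURATIONS:
        problems.append("duration missing or not in the menu")
    if report.get("urgency") not in URGENCIES:
        problems.append("urgency missing or not in the menu")
    return problems


good = {"lat": 38.83, "lon": -77.31,
        "obstacle_types": ["sidewalk obstruction"],
        "duration": "short", "urgency": "low"}
bad = dict(good, lat=138.83, urgency="extreme")
print(validate_report(good))
print(validate_report(bad))
```

Returning a list of problems rather than a single pass/fail flag is a design choice made here so that a contributor could be shown every invalid field at once.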

Risk  and  Fitness  for  Use 

For  geospatial  information,  the  weight  and  consideration  given  to  quality 
is  also  often  based  on  the  risks  associated  with  its  use.  If  quality  is  known, 
the  user  can  weigh  the  risk  of  use  and  reason  through  scenarios  where  er¬ 
rors  could  occur.  Goodchild  and  Glennon’s  discussion  of  the  crowdsourc¬ 
ing  dynamics  during  the  Santa  Barbara  wildfires  describes  this  dilemma.60 
As the wildfire moved through the Santa Barbara area and the neighborhoods
evacuated, residents had to carefully weigh the risk of elective evacuation,


58 Rice et al., Crowdsourced Geospatial Data: A Report on the Emerging Phenomena of Crowdsourced
and User-Generated Geospatial Data.

59 Michael F. Goodchild and Linna Li, “Assuring the Quality of Volunteered Geographic Information,”
Spatial Statistics 1 (May 2012): 110-20, doi:10.1016/j.spasta.2012.03.002.

60 Goodchild and Glennon, “Crowdsourcing Geographic Information for Disaster Response.”

with its extreme stress, discomfort, and dislocation, against the risk of
staying in place (possible injury or death).61 The use of crowdsourced
information in this scenario and others often involves a rapid assessment
about  the  dangers  of  accepting  asserted  information.  While  in  statistical 
science  this  assessment  is  contained  within  the  probabilistic  domain  of  a 
significance  test  and  type  I  and  type  II  errors,  in  most  scenarios  the  as¬ 
sessment  is  done  through  instinct,  experience,  and  trust.  Because  we  mod¬ 
erate  all  obstacle  reports  to  our  system,  we  consider  the  risk  for  possible 
harm  from  this  information  to  be  very  low. 

Methods  and  alternatives  for  quality  assessment 

A  foremost  concern  about  the  use  of  crowdsourced  geospatial  data,  as  not¬ 
ed  in  Rice  et  al.  2012a,62  is  quality.  Goodchild  and  Li  (2012)63  identify 
three  principal  methods  for  quality  assurance: 

1. The crowdsourced approach, based on Linus’ Law, where the
regular contributors and public at large will find and correct errors.
Goodchild and Li (2012, 114) suggest this approach works well for
prominent geographic features but not as well for obscure ones,
which gather fewer “eyes” to catch and correct errors. Very large
projects such as OSM with an active user base can make this approach
work.64

2. The social approach relies on a hierarchical structure of trusted
individuals to act as moderators and gatekeepers.65 The moderators
tend to be the contributors with the most experience and history of
contributions. Characteristic of this approach, Mooney and Corcoran
noted the same asymmetric contribution patterns seen in Wikipedia,
where a very small proportion of the user base contributes a
majority of the edits.66,67,68


61 Ibid.

62 Rice et al., Crowdsourced Geospatial Data: A Report on the Emerging Phenomena of Crowdsourced
and User-Generated Geospatial Data.

63 Goodchild and Li, “Assuring the Quality of Volunteered Geographic Information.”

64 ibid. P. 114.

65 ibid.

66 Rice et al., Crowdsourcing to Support Navigation for the Disabled: A Report on the Motivations, Design,
Creation and Assessment of a Testbed Environment for Accessibility.

67 P. Mooney and P. Corcoran, “Accessing the History of Objects in OpenStreetMap,” in Proceedings of
the 14th AGILE International Conference on Geographic Information Science, Utrecht, The Netherlands,
Eds: Stan Geertman, Wolfgang Reinhardt and Fred Toppen P, vol. 141, 2011.

68 Peter Mooney and Padraig Corcoran, “Using OSM for LBS - An Analysis of Changes to Attributes of
Spatial Objects,” in Advances in Location-Based Services, ed. Georg Gartner and Felix Ortag, Lecture
Notes in Geoinformation and Cartography (Springer Berlin Heidelberg, 2012), 165-79,
http://dx.doi.org/10.1007/978-3-642-24198-7_11.

3. The geographic approach, where geocrowdsourced contributions
are matched against known geographic facts and the known geographic
context in which the facts occur. Inconsistencies emerge
when asserted contributions conflict with known principles and
rules.

Our  Approach  to  Quality  Assessment 

The  GMU  Geocrowdsourcing  Testbed  relies  on  the  social  approach  for 
quality  assessment,  using  a  small  team  of  experienced  moderators  to 
check,  validate,  and  provide  ground  truth  for  all  reports  contributed  to  our 
system. 

As  described  in  the  previous  chapter,  the  reports  contributed  to  our  system 
receive  a  comprehensive  quality  assessment,  encompassing  all  of  the 
critical  elements  of  the  “atomic  view”  of  geographic  information,  discussed 
in  Longley  et  al.  (2011),  where  geographic  data  is  composed  of  three 
components:  location,  time,  and  attribute.  Our  moderators  check  and 
assess  these  elements  and  produce  quality  assessment  metrics  for  position, 
time,  and  attribute.  The  quality  assessment  metrics  are  combined  into  a 
single  quality  assessment  score  that  provides  a  comprehensive  metric  for 
each  report.  The  next  step  in  our  system  is  the  generation  of  obstacles, 
which  involves  identifying  any  clusters  of  reports  associated  with  the  same 
transient  event,  and  using  them  to  create  an  obstacle.  The  most  frequent 
pattern  in  our  current  system  is  to  generate  an  obstacle  from  a  single 
report,  and  this  involves  using  the  characteristics  of  the  report,  including 
its  quality  assessment,  as  the  default  attributes  for  the  obstacle. 

Although  our  moderation-based  approach  reduces  the  risk  of  erroneous 
reports,  it  is  resource  intensive.  It  requires  daily  time  and  effort,  and  the 
regular  attention  of  five  students,  who  validate  and  check  reports  in  the 
field  and  then  provide  the  quality  assessment.  A  project  consultant  and 
subject  matter  expert  suggested  that  if  the  financial  resources  that  support 
moderation  activities  are  reduced  for  any  reason,  we  consider  implement¬ 
ing  alternative  strategies  for  quality  assessment,  such  as  those  mentioned 
by  Goodchild  and  Li. 

What would be required to switch to another method of quality assessment?
Clearly, we would benefit from a greatly expanded community of
contributors, which could be recruited through the same type of social
activities (mapping parties) that have become an important part of OSM’s
success. Harnessing the altruism and social rewards associated with
contribution has been shown to be effective in similar crowdsourcing projects,
as noted by Borst (2010), Coleman et al. (2010), Rogstadius et al. (2011),
and Zhang et al. (2006).69,70,71,72 In the future, with a much larger
contributor base and with sufficient interaction from authoritative elements, we
will be able to change our approach and follow the general advice
contained in Goodchild and Li (2012) by having some of this information
generated by other contributors.

The development of our moderation workflows and processes is addressed
in Rice et al. (2013),73 Paez (2014),74 and Pease (2014).75 Using the
framework of our quality assessment sub-data model and associated
workflows outlined in Rice et al.,76 our team of moderators performs a set of
daily tasks to ensure reports contributed to our system are reviewed and
checked for quality. These tasks will be reviewed in the context of quality
assessment parameters discussed in the previous section.

Moderating  position 

The  reported  location  of  an  obstacle  in  our  GMU  Geocrowdsourcing 
Testbed  is  currently  determined  by  placement  of  a  locator  icon,  which  can 
be  click-dragged  around  the  map  (Figure  8).  A  contributor  positions  the 
icon  relative  to  familiar  buildings  and  features,  and  can  reposition  the  icon 
during  the  report  submission  process.  The  positional  accuracy  of  the  re¬ 
port  depends  on  both  the  knowledge  of  the  location  of  the  obstacle  and  the 
ability  of  the  contributor  to  place  the  icon  on  the  intended  location.  The 


69  Irma  Borst,  “Understanding  Crowdsourcing:  Effects  of  Motivation  and  Rewards  on  Participation  and 
Performance  in  Voluntary  Online  Activities"  (PhD  Series,  Erasmus  University  Rotterdam,  2010). 

70 D. Coleman, B. Sabone, and J. Nkhwanana, “Volunteering Geographic Information to Authoritative
Databases: Linking Contributor Motivations to Program Characteristics,” Geomatica 64 (2010): 27-40.

71  J.  Rogstadius  et  al.,  “An  Assessment  of  Intrinsic  and  Extrinsic  Motivation  on  Task  Performance  in 
Crowdsourcing  Markets,"  in  Proceedings  of  the  Fifth  International  AAAI  Conference  on  Weblogs  and 
future use of GPS-derived coordinates through a mobile contribution
interface will mitigate the problems with icon-based positioning, as
discussed previously for the reports with high positional errors.

Figure 8. Locator icon for positioning reports


During  the  moderation  process,  moderators  perform  a  field  check  of  the 
submitted  report  location  and  provide  an  updated  or  corrected  position 
using  the  same  tools.  Latitude  and  longitude  values  are  recorded  for  the 
original  report  positioning  and  the  moderator’s  corrected  positioning,  and 
the  distance  in  meters  between  the  two  positions  is  calculated  using  spher¬ 
ical  formulas  and  stored  in  a  field  titled  qa:positional_accuracy. 
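
The distance calculation described above can be sketched as follows. The report states only that spherical formulas are used; the haversine formula shown here is one common choice and is an assumption, as are the function name and the mean Earth radius constant.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against rounding error before the arcsine.
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, a)))


# Hypothetical contributor and moderator positions near the GMU campus.
contributor = (38.8300, -77.3080)
moderator = (38.8301, -77.3081)
positional_accuracy_m = haversine_m(*contributor, *moderator)
print(round(positional_accuracy_m, 2))
```

At the scale of a campus, a spherical model differs from an ellipsoidal one by well under the positional errors discussed in this chapter, so the simpler formula is adequate for this kind of statistic.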

As  noted  previously,  the  reports  contributed  to  our  system  are  moderated, 
quality  assessed,  and  then  used  to  create  obstacles,  which  inherit  the 
quality  measures  of  the  source  report(s),  including  positional  accuracy. 
Figure  9  shows  the  distribution  of  positional  accuracy  statistics  for  obsta¬ 
cles  in  our  system. 

The  positional  accuracy  for  reports  and  obstacles  in  our  system  is  at  the 
present  time  only  visible  to  the  moderators  and  project  staff.  The  modera¬ 
tor’s  “ground  truth”  for  position  replaces  the  contributor’s  estimate  for  re¬ 
port  position,  but  the  difference  between  the  two  values  is  stored  and  the 
original values are retained. As a future extension of our work, we will be
analyzing the expertise of our moderators to establish a “ground truth”
position, which will provide a realistic estimate for our lower bound on
positional error.

Figure 9. Horizontal Positional Accuracy for obstacles in our testbed (in
meters)

A quality assessment statistic titled qa:location is generated by performing
a two-step inverse transformation of the positional accuracy field. This
procedure scales the positional accuracy field to values between 0 and 1.
Reports that are (nearly) perfectly positioned relative to the moderator’s
ground truth (with a positional accuracy value between 0 and 1 meter) receive
a value of 1 for qa:location. Reports with a positional accuracy value greater
than 1 meter receive a simple inverse transformation. In this case, a report
with a positional accuracy of 100 m would be inverse transformed and receive a
value of 0.01. Figure 10 shows the inverse transformed values corresponding
to those shown in Figure 9. Positional accuracy and qa:location figures are
calculated, stored, and retained for every report and obstacle in our testbed.
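
The two-step transformation can be written as a short function. This is a sketch based on the description above; the function name is hypothetical, and treating a value of exactly 1 meter as falling in the first step is an assumption.

```python
def qa_location(positional_accuracy_m):
    """Scale a positional error in meters to a score between 0 and 1."""
    if positional_accuracy_m < 0:
        raise ValueError("positional accuracy cannot be negative")
    # Step 1: errors of up to 1 meter count as (nearly) perfect.
    if positional_accuracy_m <= 1.0:
        return 1.0
    # Step 2: larger errors receive a simple inverse transformation.
    return 1.0 / positional_accuracy_m


print(qa_location(0.4))
print(qa_location(100.0))
```

Handling small errors as a separate step keeps the score from exceeding 1 and avoids dividing by zero for a perfectly positioned report.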

Figure 10. Inverse transformed horizontal positional accuracy for obstacles

Future  quality  statistics  related  to  position  will  incorporate  embedded  geo¬ 
tags  and  azimuth  information  from  submitted  images,  spatial  footprints 
from  geoparsed  location  text,  and  positioning  derived  from  mobile  device 
GPS. 

Moderating  Temporal  Consistency 

Every  report  submitted  to  our  system  has  two  primary  temporal  character¬ 
istics:  the  time  of  report  submission,  captured  directly  by  code  in  the 
testbed,  and  the  time  of  observation,  which  is  selected  directly  by  the  user 
through  a  JavaScript  time/date  picker  that  defaults  to  the  contributor’s 
current time. The difference between the two times is encoded as 1 (for a
difference less than 24 hours) or 0 (more than 24 hours) and stored in a
field titled qa:temporal_consistency. The choice of values (0, 1) for the
qa:temporal_consistency statistic (and other quality statistics) is made to
facilitate the creation of a composite numeric quality statistic. As with
all of the other quality assessment items discussed in this
chapter,  all  original  temporal  characteristics  are  preserved  along  with  the 
derived  quality  assurance  measures  so  that  future  modifications  to  our 
quality  assessment  process  can  be  made,  with  values  calculated  or  recalcu¬ 
lated  automatically. 
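
A sketch of the temporal consistency calculation is given below. The function name is hypothetical, and two details are assumptions because the text does not specify them: a difference of exactly 24 hours is scored 0, and the absolute difference is used so that the order of the two times does not matter.

```python
from datetime import datetime, timedelta


def qa_temporal_consistency(observed_at, submitted_at):
    """Return 1 if observation and submission are under 24 hours apart, else 0."""
    gap = abs(submitted_at - observed_at)
    return 1 if gap < timedelta(hours=24) else 0


# Hypothetical timestamps for two reports.
same_day = qa_temporal_consistency(
    datetime(2014, 3, 3, 9, 0), datetime(2014, 3, 3, 14, 30))
late = qa_temporal_consistency(
    datetime(2014, 3, 1, 9, 0), datetime(2014, 3, 3, 14, 30))
print(same_day, late)
```

Because the original observation and submission times are retained, a different threshold could be applied later by recomputing this field.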




Moderating  attributes:  Location  description 

Contributors to our system are asked to provide a text-based description of
the location of the obstacle being reported. This description allows moderators
to locate and verify reports, and provides a way of checking the consistency
of position reporting. Moderators are responsible for correcting obvious
misspellings  and  mistakes,  but  otherwise  this  field  is  left  intact.  The  mod¬ 
erators  do,  however,  provide  a  separate  location  description  of  their  own, 
based  on  the  original  report  location  description.  This  moderated  version 
of  the  location  description  is  also  stored  with  the  report.  Development  of  a 
detailed  gazetteer  and  associated  geoparsing  capability  (discussed  in  Rice 
et  al.  2012b)  will  allow  us  to  provide  real-time  footprints  for  text-based  lo¬ 
cation  descriptions.  The  presence  of  location  text  for  a  report  is  treated  as 
a  Boolean  value  and  stored  in  a  quality  assessment  field  titled 
qa:location_text. 

Moderating  attributes:  Obstacle  type 

Contributors to our system tag their reports with an obstacle type using a
multi-selection menu (Figure 11). Possible obstacle types are sidewalk
obstruction, construction detour, entrance/exit problem, poor surface
condition, crowd/event, and other. Moderators verify obstacle type using a field
check and provide a moderator’s version of the obstacle type.

Figure 11: Obstacle type selection


The moderator’s version is stored in a field titled mod:obstacle_type, and
the categorical matching between the contributor’s obstacle type and the
moderator’s obstacle type is stored as three possible values (0 = no match,
1 = partial match, 2 = exact match) in a field titled qa:obstacle_type. The
choice of values for this statistic (0, 1, 2) is not based in theory but was
made to facilitate the creation of a composite numerical quality statistic.
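
One way to compute this three-valued match for a multi-selection menu is sketched below. The report does not define a partial match; the rule used here (the two selections share at least one type but are not identical) is an assumption, and the function name is hypothetical.

```python
def qa_obstacle_type(contributor_types, moderator_types):
    """Return 2 for an exact match, 1 for a partial match, 0 for no match."""
    contributor = set(contributor_types)
    moderator = set(moderator_types)
    if contributor and contributor == moderator:
        return 2
    # Assumed rule: any shared type counts as a partial match.
    if contributor & moderator:
        return 1
    return 0


print(qa_obstacle_type(["sidewalk obstruction"], ["sidewalk obstruction"]))
print(qa_obstacle_type(["sidewalk obstruction", "other"],
                       ["sidewalk obstruction"]))
print(qa_obstacle_type(["crowd/event"], ["construction detour"]))
```
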

Moderating attributes: Obstacle description

Contributors  provide  a  text-based  description  of  each  obstacle  they  report. 
Because  the  obstacle  description  text  is  publicly  visible  and  disseminated 
through  our  website,  this  text  field  is  checked  for  profanity  and  errors,  and 
if  necessary,  corrected  by  moderators.  If  this  field  is  edited,  moderators 
are  instructed  to  provide  a  concise,  80-140  character  description  of  the  ob¬ 
stacle,  which  is  then  stored  in  the  moderator’s  obstacle  description  field. 

Moderating  attributes:  Obstacle  duration 

Contributors  provide  their  best  estimate  for  how  long  a  particular  obstacle 
will  be  present.  This  duration  estimate  is  selected  from  a  menu  with  op¬ 
tions  Short  (<1  day),  Medium  (1-7  days),  and  Long  (>7  days),  as  seen  in 
Figure  12. 

Figure 12. Obstacle duration selection


This duration estimate is checked by moderators and adjusted if necessary.
The moderator’s estimate for duration is stored in a separate field
and a quality assessment statistic titled qa:duration is calculated and
stored as an integer to reflect the quality of the match between the
contributor’s and moderator’s estimates for this ordinal-level variable. The
field qa:duration has a value of 0 for no match, 1 for a neighboring value
match, and 2 for an exact match.
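
The ordinal matching rule can be sketched as a comparison of category positions. The function and constant names are hypothetical; the scoring (2 for an exact match, 1 for a neighboring category, 0 otherwise) follows the description above.

```python
DURATION_LEVELS = ["short", "medium", "long"]  # ordered menu categories


def qa_ordinal_match(contributor_value, moderator_value, levels):
    """Score agreement between two selections on an ordered category list."""
    gap = abs(levels.index(contributor_value) - levels.index(moderator_value))
    if gap == 0:
        return 2  # exact match
    if gap == 1:
        return 1  # neighboring category
    return 0      # no match


print(qa_ordinal_match("short", "short", DURATION_LEVELS))
print(qa_ordinal_match("short", "medium", DURATION_LEVELS))
print(qa_ordinal_match("short", "long", DURATION_LEVELS))
```

Passing the ordered list as a parameter lets the same function serve any other three-level ordinal field scored in the same way.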


Moderating  attributes:  Obstacle  urgency 

A  key  field  in  our  GMU  Geocrowdsourcing  Testbed  is  an  obstacle’s  urgen¬ 
cy,  which  is  intended  to  be  a  reflection  of  the  contributor’s  and  modera¬ 
tor’s  assessment  of  how  serious  the  obstacle  is  and  what  possible  safety 
issues  or  danger  it  presents.  Contributors  select  their  best  estimate  for  ur¬ 
gency  using  a  pull-down  menu  (Figure  13),  which  has  values  for  Low  (rep¬ 
resenting  a  mere  inconvenience),  Medium  (a  moderate  inconvenience  and 
a  possible  safety  hazard),  and  High  (a  significant  inconvenience  and  a  sig¬ 
nificant  safety  hazard). 

Figure 13: Obstacle urgency selection


Moderators are asked to carefully check the urgency estimate provided by
the report contributor, and to provide an estimate of their own, based on
standards decided upon collectively by the moderators. Similar to
qa:duration, the match between the contributor’s urgency estimate and the
moderator’s urgency estimate is calculated and stored in a field titled
qa:urgency. As with qa:duration, qa:urgency receives a value of 0 for no
match, 1 for a neighboring category match, and 2 for an exact match.
Future work will be done to assess the consistency of individual moderators
in providing this attribute assessment as well as all other moderator-based
decisions used for quality assessment.
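
The 0/1/2 scoring used for qa:duration and qa:urgency can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name ordinal_match_score and the two category lists are hypothetical, and the testbed's own implementation is not described in this report.

```python
# Ordinal categories in rank order; labels follow the menus described above.
DURATION_LEVELS = ["Short", "Medium", "Long"]
URGENCY_LEVELS = ["Low", "Medium", "High"]


def ordinal_match_score(contributor, moderator, levels):
    """Return 2 for an exact match, 1 for neighboring categories, else 0.

    Hypothetical helper: compares the positions of the two estimates
    on the ordered category list.
    """
    distance = abs(levels.index(contributor) - levels.index(moderator))
    if distance == 0:
        return 2
    if distance == 1:
        return 1
    return 0


# Example: contributor says Short, moderator says Medium (neighboring).
qa_duration = ordinal_match_score("Short", "Medium", DURATION_LEVELS)
# Example: both say High (exact match).
qa_urgency = ordinal_match_score("High", "High", URGENCY_LEVELS)
print(qa_duration, qa_urgency)
```

With only three categories, a distance of two (for example Short versus Long) is the only case that yields a score of 0.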


Moderating  attributes:  Images,  feedback,  and  comments 

Reports contributed to our system usually have images attached to them.
Images are a very useful component of a report due to their use in
verification of the obstacle and its location, and in many cases,
disambiguation of the text-based obstacle description. Moderators check
these images to ensure that they are appropriate and relevant. The quality
of the images is assessed using established guidelines (Table 1) and a
quality value and numerical score are assigned. A quality assessment
statistic qa:image_quality is calculated and stored with the report. As
with other moderator-derived quality statistics, future work will be done
to assess the consistency of the individual moderators in making this
assessment.

Table 1. Moderator Assessment Guidelines for Image Quality

Quality Rank | Score | Guideline | Description
Missing | 0 | Image was not provided |
Low | 1 | Image has multiple issues | Photo does not fully encapsulate obstacle and does not provide a reference point for obstacle’s location, or photo is blurry or of otherwise low quality that hinders ability to tell what the obstacle is or where it is located
Medium | 2 | Image may have one issue but is useful, as overall quality is good | Photo may be missing a reference point to determine location (too zoomed in, no buildings in background), photo may be taken at night so it is difficult to see obstacle detail, or photo does not fully encapsulate obstacle
High | 3 | Image has no barring issues | Photo was taken during the day, not blurry, provides a reference point (such as a building), fully encapsulates obstacle

Moderators also check the feedback and comments for each report, and
can  attach  their  own  internal  comments,  which  are  saved  with  the  report 
and  viewable  internally  by  all  moderators  and  project  staff.  Feedback  and 
comments  are  not  required  elements  for  report  submission  and  are  not 
included  in  the  obstacle  quality  score  calculations. 

Moderating  malicious  content 

An automated PHP-based scripting tool searches all text-based fields in
each report for profanity, and flags the reports that contain any entries
from a list of offensive words and terms. This list is based on Google’s
blacklisted terms and has approximately 2000 entries. A Boolean moderator
flag called MODFLAG is set to TRUE when an offending term is found. Report
images, as mentioned previously, are checked for malicious or
inappropriate content. Reports that have MODFLAG = TRUE because of
profanity or for other reasons are not publicly displayed unless a
moderator fixes the problem and resets the MODFLAG to FALSE.
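
The flagging logic can be illustrated with a short sketch. The testbed's tool is PHP-based and its code is not reproduced in this report, so the Python below is an assumption-laden stand-in: the function names, the report field names, and the placeholder term list are all hypothetical.

```python
import re


def find_flagged_terms(report_fields, blocked_terms):
    """Return the blocked terms that occur in any text field of a report.

    Matching is case-insensitive and requires the term to stand alone
    (not embedded inside a longer word). Multi-word terms are supported.
    """
    found = []
    for term in blocked_terms:
        pattern = re.compile(
            r"(?<!\w)" + re.escape(term) + r"(?!\w)", re.IGNORECASE
        )
        if any(pattern.search(text) for text in report_fields.values()):
            found.append(term)
    return found


def compute_modflag(report_fields, blocked_terms):
    """MODFLAG is True when at least one blocked term is present."""
    return bool(find_flagged_terms(report_fields, blocked_terms))


# Placeholder list standing in for the roughly 2000-entry term list.
blocked = ["blockedword", "blocked phrase"]
report = {
    "description": "Sidewalk closed near the library",
    "comments": "This is a BlockedWord example",
}
print(compute_modflag(report, blocked))
```

Whole-term matching is a design choice made here to avoid flagging harmless words that merely contain a blocked string; whether the testbed's tool matches whole terms or substrings is not stated.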

Moderating  logical  consistency 

As covered in the previous section, logical consistency is an important
aspect of quality assessment and has a number of different manifestations.
Girres (2010) describes logical consistency (using wording from Servigne
et al. 2000)77 as “the degree of internal consistency as modeling rules and
specifications (including compliance with integrity constraints).”78
Because our application has a specific geographic scope, and because we
intend to provide relevant information and services to end-users within
that geographic area, we screen all incoming reports for consistency.
Reports must fall within the system boundaries, which are upper left:
38.861782, -77.346539 and lower right: 38.812844, -77.288174. When they
fall outside the boundary, a Boolean variable called boundary_check is set
to false. Reports with boundary_check = false are not displayed in the
system, but can be modified and fixed by moderators. Moderators check the
validity of content in all fields and, when necessary, make changes.
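
The boundary screen amounts to a bounding-box test on latitude and longitude. The sketch below uses the corner coordinates given above; the function signature and the choice to treat points exactly on the edge as inside are assumptions, not details taken from the testbed.

```python
# System boundary corners as stated in the text (decimal degrees).
NORTH, WEST = 38.861782, -77.346539   # upper left
SOUTH, EAST = 38.812844, -77.288174   # lower right


def boundary_check(lat, lon):
    """Return True when a report location lies inside the system boundary.

    Points on the boundary edge are treated as inside (an assumption).
    """
    return SOUTH <= lat <= NORTH and WEST <= lon <= EAST


# A point near the GMU Fairfax campus, and one well outside the study area.
print(boundary_check(38.8316, -77.3117))
print(boundary_check(38.9000, -77.0300))
```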


77 Sylvie Servigne et al., “A Methodology for Spatial Consistency Improvement of Geographic
Databases,” GeoInformatica 4, no. 1 (2000): 7-34.

78 Girres and Touya, “Quality Assessment of the French OpenStreetMap Dataset,” p. 440.

Moderating completeness

Reports that are submitted to our system are given a composite score based
on the number of fields that have valid entries. A report that is fully
complete with valid entries in every field gets a score of 100%, while a
report void of content receives a 0%. In practice, the GMU
Geocrowdsourcing Testbed enforces valid data entry in several fields before
submission can be completed, so completeness scores below 48% are not
possible for successfully submitted reports. Moderators view the
completeness score (called qa:completeness), but the score is calculated
automatically and stored with each report.
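
A completeness score of this kind can be sketched as the percentage of fields holding a valid entry. The field list and the validity rule below (a non-empty value) are hypothetical; the report does not enumerate the fields or the validation rules, so this sketch will not reproduce the 48% floor mentioned above.

```python
# Hypothetical field list; the testbed's actual schema is not given here.
REPORT_FIELDS = [
    "user_id", "observed_at", "latitude", "longitude",
    "location_text", "obstacle_type", "image",
    "duration", "urgency", "comments",
]


def is_valid(value):
    """Assumed validity rule: the field is present and not blank."""
    if value is None:
        return False
    if isinstance(value, str):
        return value.strip() != ""
    return True


def completeness_score(report, fields=REPORT_FIELDS):
    """Return the percentage (0-100) of fields with a valid entry."""
    valid = sum(1 for name in fields if is_valid(report.get(name)))
    return 100.0 * valid / len(fields)


example = {
    "user_id": "abc123", "observed_at": "2014-06-30 10:15",
    "latitude": 38.8316, "longitude": -77.3117,
    "obstacle_type": "Sidewalk obstruction",
    "duration": "Short", "urgency": "Low", "comments": "  ",
}
print(completeness_score(example))
```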

Moderation:  Summary  measures 

Moderators use all factors to assign a subjective quality score to each
report. This score, qa:moderator_quality_score, is an ordinal-scale
variable from 1 (very low quality) to 5 (very high quality), and is
assigned based on written guidelines, training, and joint agreement.
However, based on the broad coverage of this metric and its somewhat
subjective nature, there have been concerns about moderator consistency in
making this assessment. Assessing moderator consistency is of great
interest as a future subject of study.

In computing our broadest and most all-encompassing composite final
quality assessment score (discussed below and summarized in Table 2) the
qa:moderator_quality_score was included as a factor and weighted heavily
due to the perceived high value of this comprehensive assessment. We have
recently considered the consistency of moderator assessments for this
item, and the way that this moderator score might be contributing
redundant content in our comprehensive composite quality scores. To look
more closely at this issue, we assessed the relationship between this
comprehensive moderator score and the final quality score metric with this
moderator score removed, which we refer to as the quality assessment total
score or, in abbreviated form, qa:total_score. We expected to see a strong
positive relationship between these two variables. Figure 14 shows the
frequency of qa:moderator_quality_score for obstacles in our system, and
Figure 15 shows the relationship between this comprehensive moderator
score and the quality assessment total score. Figure 15 indicates that
there is no strong relationship between the qa:moderator_quality_score and
qa:total_score. The dynamics of this relationship will be investigated in
the next phase of our research. In future quality metrics for our testbed
we will compute a total quality metric both with and without the
contribution of the qa:moderator_quality_score.
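
One way to quantify the relationship between an ordinal moderator score and a numeric total score is a rank correlation. The report does not say which statistic, if any, was computed, so the Spearman coefficient below is offered only as one reasonable choice, and the sample values are made up purely to exercise the code.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up illustrative values, not testbed data.
moderator_scores = [3, 4, 4, 5, 2, 3]
total_scores = [61.0, 58.5, 70.0, 66.0, 63.5, 72.0]
print(round(spearman(moderator_scores, total_scores), 3))
```

A coefficient near +1 would correspond to the strong positive relationship that was expected; a value near 0 would be consistent with the pattern described for Figure 15.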


Figure 14. Moderator Score Frequency for testbed obstacles

Figure 15. QA:Total_Score by Moderator Score (chart title: Quality Assessment Total Score by Moderator Score; axis label: Moderator Score)


Moderation:  Generation  of  a  Final  Quality  Score 

Once all the quality assessment tasks are finished and moderators have
completed all moderation activities, a final quality metric is calculated.
This quality metric is used with the other key quality metrics to provide
our quality assessment. The final quality metric, titled qa:final_score, is
a composite linear combination of all other quality assessment metrics, and
has a range from 0 to 100. In practice, the quality scores for most reports
vary between 55 and 95. The ranking and weighting of components for the
qa:final_score is based on the mutual expert assessment of the moderator
team, whose experience reviewing reports leads them to perceive certain
quality assessment metrics as being more valuable than others. For
instance, the quality of positioning (weighted at 17% and stored as
qa:location) is perceived by the moderators to be a better indicator of the
report’s quality than the quality assessment statistics for the estimates
for urgency and duration, which are weighted at 12% and 10%, respectively
(Table 2). The weighting formula shown in Table 2 has changed several
times and will continue to change as we refine our quality assessment
metrics and discover which measures are most indicative of high quality. As
noted, future quality assessment metrics will use a version of a
comprehensive quality score that removes subjective moderator quality
assessments, which are useful but may include redundancies.


Table 2. Quality Metric Calculations: Final Score

Quality Assessment Variables | Values | Ranks | Weight (%)
QA: Temporal Consistency | 0,1 | 7 | 6
QA: Location (X,Y) | Max = 1, Min = 0 | 2 | 17
QA: Location text | 0,1 | 8 | 5
QA: Image Quality | 0,1,2,3 | 3 | 15
QA: Obstacle type | 0,1,2 | 5 | 11
QA: Duration | 0,1,2 | 6 | 10
QA: Urgency | 0,1,2 | 4 | 12
QA: Completeness | 0-100 scaled to 0-1 | 9 | 4
QA: Moderator Quality Score | 1-5 | 1 | 20
Total | | | 100

Moderation:  Generation  of  Obstacles 

After reports are moderated and receive quality scores, the moderator team
generates obstacles from the reports, using a set of clustering tools to
select similar reports and group them together. Each obstacle inherits
characteristics from a template report, which is selected by the moderators
and typically has the highest quality score. When multiple reports are
clustered and used to generate an obstacle, a summary of the quality scores
from the source reports is preserved along with the complete quality
assessment from the template report. Obstacles are then generated and
published to our website, which can display reports and obstacles depending
on the viewer’s preference. At present, we have no criteria for
thresholding reports and removing any from consideration. The quality
metrics we are developing will lead us, during the next phase of our
research, to analyze the influence of individual quality metrics and to
assess the causes of both high and low scores. We also anticipate that
quality scores will be a major factor in which reports are displayed and
which are hidden from view, in areas and circumstances where multiple
reports have been received for the same item.
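
As a rough illustration of how an obstacle might be assembled from a cluster of reports, the sketch below proposes the highest-scoring report as the template and keeps a summary of the source scores. In the testbed the template is chosen by moderators, so this is at most a default suggestion; the field names and the summary statistics shown are hypothetical.

```python
def build_obstacle(cluster):
    """Summarize a cluster of moderated reports into an obstacle record.

    cluster: list of dicts with (hypothetical) keys 'report_id' and
    'qa_final_score'. The highest-scoring report is proposed as template.
    """
    template = max(cluster, key=lambda report: report["qa_final_score"])
    scores = [report["qa_final_score"] for report in cluster]
    return {
        "template_report_id": template["report_id"],
        "source_report_count": len(cluster),
        "score_min": min(scores),
        "score_max": max(scores),
        "score_mean": sum(scores) / len(scores),
    }


# Made-up reports describing the same obstacle.
cluster = [
    {"report_id": 101, "qa_final_score": 72.0},
    {"report_id": 102, "qa_final_score": 88.0},
    {"report_id": 103, "qa_final_score": 65.0},
]
print(build_obstacle(cluster))
```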

Figure 16 and Figure 17 show the qa:final_score statistics for the 55 GMU
Geocrowdsourcing Testbed obstacles collected between May 17, 2013 and
June 30, 2014, which were created from 221 reports. The scores for
obstacles are ordered by submission date in Figure 16, and by
qa:final_score in Figure 17. As can be seen in these figures, quality
scores for current obstacles vary between 55% and 95%. Our current testbed
status as of September 4, 2014 indicates that we presently have 330
submitted reports and 90 obstacles generated from those reports.


Figure 16. Quality assessment final scores for obstacles, sorted by submission date (axis label: Quality Assessment Final Score)

Figure 17. Quality assessment final scores for obstacles, sorted by score (axis label: Quality Assessment Final Score)


Later in this report (Chapter 4) we look at different methods for
visualizing information collected in the GMU Geocrowdsourcing Testbed,
including graphics associated with the metrics described in this chapter.

The next chapter of this report (Chapter 3) looks at the creation of a
training and recruitment program for contributors (Paez 2014).79 The
significance of this research activity will be summarized with respect to
quality assessment and the success of the GMU Geocrowdsourcing Testbed.


79 Paez, “Recruitment, Training, and Social Dynamics in Geo-Crowdsourcing for Accessibility.”

3 Recruitment and Training in the GMU Geocrowdsourcing Testbed

The emerging trend of geocrowdsourcing provides a cost-effective way to
capture data that is often more accurate than authoritative data, as the
local geographic knowledge that every individual carries with them provides
the expertise required to contribute.80 However, innate geographic
knowledge is often not enough. An effective geocrowdsourcing system
requires contributors who are trained and willing to participate. This
chapter focuses on the dynamics of participation and how the recruitment
and training of participants in our project influences quality.

Origins of a Training and Recruitment System: Examples and Approaches

The work of Fabiana Paez, published April 2014 as a master’s thesis,
outlined the early phase of this project, where a comprehensive survey of
geocrowdsourcing applications was conducted.81 These applications
(reviewed in the third chapter of Rice et al. 2012a82) were categorized by
the activity involved: imaging, georeferencing, transcribing, digitizing,
attributing, reporting, searching, tracking, validating, polling/surveying,
socializing, and sharing. From the nearly 200 geocrowdsourcing projects and
applications reviewed by Paez and research team member Brandon Shore,
twenty-four projects and applications were chosen for detailed analysis,
based on their prominence and ranking by project staff and collaborators
at GMU and ERDC. From this analysis, Paez determined that more than half
of the applications involved some type of basic contribution tracking, 20%
of the projects incorporated a rating system for users or for
contributions, 14% included some type of content restriction,83 and 26% of
the projects involved some type of training for contributors (Figure 18).


80 Goodchild, “Citizens as Sensors: The World of Volunteered Geography”; Goodchild, “NeoGeography
and the Nature of Geographic Expertise.”

81 Paez, “Recruitment, Training, and Social Dynamics in Geo-Crowdsourcing for Accessibility.”

82 Rice et al., Crowdsourced Geospatial Data: A Report on the Emerging Phenomena of Crowdsourced
and User-Generated Geospatial Data.

83 Waze, for instance, reserves the right in its Terms of Service to restrict content or delete any content it
deems inappropriate, and in Google Maps’ developer terms of service, certain business listings are
subject to restrictions (https://developers.google.com/maps/terms) [accessed Sep. 4, 2014].

Figure 18. Proportion of surveyed geocrowdsourcing applications incorporating
quality assessment characteristics, from Paez (2014). Used with permission.
(Axis label: Review Criteria; categories: Method for tracking contributions,
Rating system, Content restrictions, Training)


Building on this comprehensive survey and analysis, Paez more closely
analyzed the user training methods incorporated in Waze (a popular social
traffic and navigation app),84 OSM,85 The National Map Corps (TNM
Corps) of the U.S. Geological Survey (USGS),86 and Google Map Maker.87


Training:  Waze 

Waze is an extremely popular social traffic reporting and navigation app,
with a reported user base of 50 million individuals.88 Google acquired
Waze in June 2013 for 1.3 billion dollars, in one of the largest and most
notable high-tech acquisitions of the year. The application allows users to
receive turn-by-turn navigation via GPS, but with the addition of
location-specific updates of traffic load, traffic accidents, roadblocks,
police checkpoints, and fuel prices. The app also allows users to report
errors in the map database. Due to Waze’s commercial/consumer orientation
and its use while driving, the interface is sparse and simple, and every
function is achieved with a single finger tap on an icon. There is no
training needed for the app, due to its simplicity and intuitiveness, but
there are training elements associated with the map editor application.
Through a few short videos, Waze explains the purpose and functionality of
the mobile application, and how to edit the application’s basemap. The
basemap editor contains embedded video, which shows users step-by-step
directions for editing while inside the application (Figure 19).

84 “Waze - Social Traffic & Navigation App,” accessed June 20, 2012, http://www.waze.com/.

85 “OpenStreetMap,” accessed April 28, 2012, http://www.openstreetmap.org/.

86 “The National Map Corps,” USGS, August 2, 2011, http://nationalmap.gov/TheNationalMapCorps/.

87 “Google Map Maker,” Google, accessed May 1, 2012, http://www.google.com/mapmaker.

88 Josef Federman and Max J. Rosenthal, “Waze Sale Signals New Growth for Israeli High Tech,” News,
Yahoo News, (June 12, 2013), http://news.yahoo.com/waze-sale-signals-growth-israeli-high-tech-174533585.html.


Figure 19. Waze Map Editor training video embedded in website

Training: OpenStreetMap

Having a different profile than Waze (which was characterized as a simple
tracking and reporting application in Paez 2014 and Rice et al. 2012a),
OSM is a much larger and more ambitious application, or perhaps from
some perspectives, a significant social movement and the centerpiece of
the open source mapping world. Our previous report provided extensive
descriptions of OSM from a variety of perspectives, but in our Chapter 3
analyses and in Paez (2014) it was characterized as a digitizing and
validating application.

At its core, OSM is about generating high-quality basemaps and data for
use in open-source applications. OSM provides a large number of tools
and resources for learning how to edit its maps. After creating an account
in OSM, users receive an introductory email providing a list of the
resources where users can find training material, such as videos,89 a
wiki,90 and a questions and answers site.91 The training methods used in
OSM vary in length and complexity according to the many ways to contribute
data. In early 2014, OSM improved its training method for editing its map,
which is now embedded in its website (Figure 20).92 The “Walkthrough”
link takes the user to an interactive training module, which shows the
different features available to edit the maps. As Figure 21 shows, while
completing this optional training, a bar at the bottom of the page shows
users the sections that have been explained and the ones that have not
been reviewed. This embedded training is similar in functionality to that
of Waze.


Figure  20.  OSM  edit  tool  with  embedded  training 


89 Steve, “OpenStreetMap,” ShowMeDo, 2008,
http://showmedo.com/videotutorials/series?name=mS2PlZqS6.

90 “Beginners’ Guide - OpenStreetMap Wiki,” July 28, 2013,
http://wiki.openstreetmap.org/wiki/Beginners%27_Guide.

91 “OpenStreetMap Help Forum,” accessed August 26, 2013, https://help.openstreetmap.org/.

92 To view the embedded training video, see:
https://www.waze.com/editor/?lon=-95.54538&lat=36.22089&zoom=0#tutorial-dialog [accessed Sep. 4, 2014]

Figure 21. OSM edit-tool training material with bar showing training progress

Training: The National Map Corps (USGS)

TNM Corps is a program of the USGS, which involves crowdsourcing data
collection applications where end-users can edit the National Map
database.93 It was discussed in detail in our 2012 report94 and is an
important point of reference in looking at how one of the most important
and revered authoritative government geospatial data producers is
incorporating crowdsourcing into its map production workflows.

The training, developed as a part of the USGS TNM Corps program, teaches
contributors how to execute editing tasks and is less interactive and
dynamic than the training provided by Waze and OSM. However, TNM Corps
training material provides step-by-step, detailed instructions using simple
terminology, concise information, and graphics of the map editor’s
interface. Figure 22 shows some sections of the TNM Corps training
material, which describes some of the features in the map editor interface
through static arrows and text.


93 See https://my.usgs.gov/confluence/display/nationalmapcorps/Home [accessed Sep. 4, 2014]

94 Rice et al., Crowdsourced Geospatial Data: A Report on the Emerging Phenomena of Crowdsourced and
User-Generated Geospatial Data.

Figure 22. TNM Corps training material

Training: Google Map Maker

Google Map Maker was profiled in our 2012 survey and in Paez (2014) as a
digitizing application in the same category as OSM and Wikimapia, which
suggests that a primary focus of the application is the generation of map
features. Its training combines some features also found in the training
from Waze and OSM. The interactive training is embedded in the map
interface, with videos demonstrating how to use each feature of the editor
tool with step-by-step instructions (Figure 23). Google Map Maker also
provides a Help Center, which explains the project in more detail, and
provides additional information such as videos and tutorials, community
resources, and troubleshooting (Figure 24).

Figure 24. Google Map Maker additional information and training material




Summary  of  Training  Examples 

Each of the four examples from this section is unique in some way, but
there are also strong similarities. Based on the success of the applications,
these similarities could be considered best practices in training for
geocrowdsourcing. First, each training example includes general
information about the project and its purpose. Second, each of the training
examples provides some interactive, animated, video-based, or graphical
explanation of the application’s functionality. Projects with a strong
commercial motivation (e.g., Waze) tend to have effective, simple, and
well-produced training material. On the other end of the spectrum is the
USGS TNM Corps, which has simple and informative training content, but
with static graphics in a much more traditional format.

The  GMU  Geocrowdsourcing  Testbed  Training 

The ideas from this short survey of training material in four popular
geocrowdsourcing applications have aided and will continue to improve the
development of our own training materials. We are not a commercial firm
like Google, or a commercial product like Waze, nor are we a traditional
and formal government agency. It would not be sensible to hire commercial
firms for producing elaborate video content, but we have tried to use
the best methods to reach our university audience using the tools and
technology available to us and consistent with our goals and funding.

Our training program has a dual goal: to recruit project participants, and
to train our participants adequately to contribute. This dual goal has been
met to the extent that enough participants have been recruited to generate
more than 300 reports (as of September 2014), and most of those reports
have provided enough information about transient obstacles to be useful.
The training material presented on the following pages is delivered in two
ways: in person through PowerPoint and interactive discussion, and online
through Adobe Presenter and recorded video/audio. Of the more than 200
people who have been trained, 149 have participated in in-person training
using PowerPoint material and another 52 have participated in online
training through Adobe Presenter served through GMU’s Blackboard System.
Twenty-eight individuals have contributed obstacle reports to our system.

The training material is designed and structured to take no more than 20
minutes, consistent with design practices recommended by educators in
GMU’s Division of Instructional Technology, and includes many of the
same general categories of information identified in the previously profiled
training material:

•  A  general  project  overview  and  statement  of  purpose, 

•  A  graphical  explanation  of  application  functionality,  and 

•  A  short  assessment 

Figure 25 and Figure 26 show the PowerPoint slides created for the general
project overview and purpose statement. They include concise descriptions
of the problem and purpose of the research (Figure 25) and a brief
description of crowdsourcing (Figure 26).


What  we  do 


Current  GMU  Accessibility  System 

•  Large  community  with  accessibility  requirements 

•  Changing  GMU  environment 

•  No  real-time  data 

•  Small  scale  maps 

•  Static  maps 

Purpose  of  Research 

Design a system to combine official information with
geo-crowdsourcing to facilitate accessibility

Figure 25. Training material - Introduction to the research problem and purpose

Why Geo-Crowdsourcing


Definition 

Local  community  collects  and  share  geo-referenced 
data 

Benefits 

•  Real-time  information 

•  Low  cost  /  free 

•  Local  expertise 

•  Technology  and  social  media 


Examples 

•  Waze 

•  OpenStreetMap 

•  And  many  others... 


Figure 26. Training material - Introduction to research framework


Next, a functional diagram of the reporting system procedure, shown in
Figure 27, introduces the process of reporting obstacles in a set of ordered
steps. The diagram portrays the reporting process as simple and fast.

[Slide “How It Works”; first step: “Walk around campus as you regularly do”]

Figure 27. Training material - Obstacle reporting process


After the reporting process is presented conceptually (Figure 27), each of
the steps necessary to complete the reporting process is explained in detail
on a series of eight slides similar to Figure 28, where the reporting
interface is shown with limited-text labels highlighting data entry or
interaction points. In this case, the participant is shown where to enter a
user-ID and the date/time of the report.

Reporting Obstacles

1. Welcome 2. Obstacle Location 3. Obstacle Type 4. Upload Image 5. Duration & Priority 6. Comments &

Welcome to the obstacle reporting system for supporting accessibility at GMU.

When did you observe obstacle?

Create a unique ID

- Use letters, numbers, symbols, & others

- Do NOT use anything that identifies you (i.e. name, last name, G#)

- Use the same ID every time you report

Indicate date & time

- When did you observe the obstacle?

Figure 28. Training material - Obstacle-reporting online form


The  final  section  of  the  training  provides  more  detail  on  the  obstacle  re¬ 
porting  process.  This  section  provides  representative  examples  of  each 
major  obstacle  type  in  our  system.  These  representative  examples  take  the 
form  of  images  selected  through  the  consensus  of  project  staff  to  be  most 
representative of each obstacle type (Figure 29). Participants learn to
characterize obstacles by viewing these representative images together with
the associated obstacle type, duration estimate, and urgency (priority)
estimate. Figure 29 shows a slide of one of the five examples given during
the training. The image shows a cracked sidewalk, which represents the
obstacle category “poor surface condition” and might pose a significant
inconvenience to a person with a mobility impairment; the hazard is
therefore of medium urgency. The duration category refers to the amount of
time that the hazard will be in place. A cracked sidewalk, which will most
likely require more than seven days to be fixed, would be considered of
long duration.
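
The duration rule described above can be written compactly. The following is a minimal Python sketch, assuming the three duration classes shown on the training slide (less than one day, one to seven days, more than seven days). The function name and the handling of the exact boundary values are illustrative assumptions and are not part of the testbed.

```python
def duration_category(expected_days: float) -> str:
    """Map an estimated obstacle lifetime (in days) to a duration class."""
    if expected_days < 0:
        raise ValueError("expected_days must be non-negative")
    if expected_days < 1:
        return "Low"      # expected to be gone within a day
    if expected_days <= 7:
        return "Medium"   # one to seven days
    return "Long"         # more than seven days


# A cracked sidewalk that may take about a month to repair.
print(duration_category(30))
```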



[Slide: Examples (cont’d)]

Obstacle Type: Sidewalk obstruction; Construction detour; Entrance/Exit problems; Poor surface conditions; Crowd/Event
Duration: Low (<1 day); Medium (1-7 days); Long (>7 days)
Priority: Low; Medium; High


Figure  29.  Training  material  -  Examples  and  assessment  of  obstacles 


After explaining how to assess each type of obstacle, the training presents
a categorization exercise to participants. The exercise has previously been
used to evaluate the quality of the training and how well the participants
understand both the training and the obstacle attributes they must report.
The participants are told that the exercise is not a test and that there
are no right or wrong answers. The exercise consists of 15 pictures of
obstacles (Figure 30), each displayed for 20 seconds, during which time
each participant assesses the obstacle type, obstacle duration, and
obstacle urgency. The obstacle pictures were taken within the GMU Fairfax
campus and the surrounding neighborhoods, in order to make the assessment
realistic and reflective of the types of obstacles likely to be encountered
during future report contributions.
Figure 30. Categorization exercise - Sample of the pictures of obstacles


Participants  are  provided  an  answer  sheet  to  record  their  obstacle  assess¬ 
ments.  The  categorization  answer  sheet  is  designed  using  simple-choice 
answers  in  the  categories  of  obstacle  type,  duration,  and  urgency,  which 
were  the  same  categories  explained  in  the  previous  section  of  the  training. 
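
The report does not describe how completed answer sheets were tallied. As a sketch under that caveat, the share of pictures on which a participant's choice matches a staff reference answer could be computed per attribute as follows. The field names, the idea of a reference answer, and the sample answers are all hypothetical.

```python
from typing import Dict, List

ATTRIBUTES = ("type", "duration", "urgency")


def agreement_rates(answers: List[Dict[str, str]],
                    reference: List[Dict[str, str]]) -> Dict[str, float]:
    """Share of pictures where the participant matches the reference, per attribute."""
    if len(answers) != len(reference) or not answers:
        raise ValueError("answer sheets must be non-empty and of equal length")
    rates = {}
    for attr in ATTRIBUTES:
        matches = sum(a[attr] == r[attr] for a, r in zip(answers, reference))
        rates[attr] = matches / len(answers)
    return rates


# Hypothetical two-picture example (the real exercise uses 15 pictures).
reference = [
    {"type": "poor surface conditions", "duration": "Long", "urgency": "Medium"},
    {"type": "sidewalk obstruction", "duration": "Low", "urgency": "High"},
]
answers = [
    {"type": "poor surface conditions", "duration": "Long", "urgency": "High"},
    {"type": "sidewalk obstruction", "duration": "Low", "urgency": "High"},
]
print(agreement_rates(answers, reference))
```

Low agreement on a single attribute would point to the part of the training that needs revision, rather than to a failing participant.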

Conclusions  and  Summary 

The  purpose  of  our  training  program  is  to  recruit  contributors  and  partici¬ 
pants,  as  well  as  provide  knowledge  of  the  project  and  applications  that 
will  lead  to  higher  quality  contributions.  Efforts  to  improve  training  may 
pay  off  directly  with  higher  quality  data.  Future  versions  of  the  training 
material  for  the  GMU  Geocrowdsourcing  Testbed  will  be  built  using  this 
material  as  a  framework,  and  where  possible,  will  be  incorporated  into  the 
tool  interface,  so  that  small  elements  of  learning  or  training  can  be  done 
while  using  the  tools. 
Baldwin and Ford (1988)95 note that many training systems have a
significant transfer problem, in which trainees have a difficult time
applying knowledge and skills learned during training to subsequent tasks
and jobs. They estimate that in some industrial training settings, no more
than 10% of the training expenditures result in the transfer of skills to a
related job. One conclusion drawn from Baldwin and Ford’s research is that
periodic review of training materials, referred to as “booster sessions,”
as well as continued periodic reinforcement from a trainer, has a positive
effect on a trainee’s retention of knowledge and skills.

John Carroll’s 1990 exploration of minimalist interface design and
training in computer tasks, “The Nurnberg Funnel: Designing Minimalist
Instruction for Practical Computer Skill,”96 and Farkas and Williams’s
(1990) review of Carroll’s work suggest several ideas for training
design.97 Carroll criticizes systematic training, in which tasks are broken
down into sub-tasks and presented sequentially in tutorials prior to
working on a task, for ignoring computer users’ interest in quickly
immersing themselves in an activity where they can exercise their
problem-solving abilities. Carroll suggests that training materials should
be built from short, task-specific modules rather than lengthy user-manual
narratives. While Farkas and Williams support many of Carroll’s assertions
in his promotion of minimalist training approaches and experiential
learning, they advocate flexible computer training methods in which
diverse learning styles can be accommodated.

The “learning while doing” aspect of training, which echoes some of
Carroll’s minimalist ideas, may be effective in allowing users to transfer
knowledge gained from training immediately to the actual task of
contributing reports. This approach is similar to the general way that
Waze, Google, and OSM approach training. For users with an intuitive
understanding of the process, the training (broken into smaller embedded
modules) could be skipped completely, while users requiring more detail and
instruction could complete the embedded training modules at their own
speed; these modules would also provide the periodic review (or “booster
sessions”) discussed by Baldwin and Ford. The current training material for
the GMU Geocrowdsourcing Testbed has a linear, fixed-length format due


95  Timothy  T.  Baldwin  and  J.  Kevin  Ford,  “Transfer  of  Training:  A  Review  and  Directions  for  Future  Re¬ 
search,”  Personnel  Psychology  41,  no.  1  (1988):  63-105. 

96  John  M.  Carroll,  The  Nurnberg  Funnel:  Designing  Minimalist  instruction  for  Practical  Computer  Skill 
(MIT  Press,  1990). 

97  David  K.  Farkas  and  Thomas  R.  Williams,  “John  Carroll's  the  Nurnberg  Funnel  and  Minimalist  Docu¬ 
mentation,”  IEEE  Transactions  on  Professional  Communication  33,  no.  4  (1990):  182-87. 
to its reliance on PowerPoint. Allowing potential contributors to engage in
the  training  when  they  need  it,  and  ignore  it  otherwise,  will  be  a  future  di¬ 
rection  for  our  work. 

The  next  chapter  of  this  report  looks  at  two  new  application  areas  for  our 
testbed:  routing  and  visualization.  These  new  application  areas  extend  the 
GMU  Geocrowdsourcing  Testbed’s  capabilities  into  new  areas  of  interest 
to  our  sponsors,  partners,  and  research  staff. 
4 Extension of the GMU Geocrowdsourcing
Testbed: Routing

After losing his eyesight as an adult, Dr. Reginald Golledge worked with
colleagues to develop the UCSB Personal Guidance System (Figure 3), which
helped him navigate across the college campus where he worked (Loomis et
al. 2005).98 The 2003 version of the system shown in Figure 3 consists of a
GPS receiver, a head-mounted fluxgate compass, a geographic information
system on a laptop computer, and a handheld tactile pointer interface.
Using the system along with tactile maps and graphics, Dr. Golledge could
learn the spatial layout and configuration of sidewalks, buildings,
entrances, exits, and landmarks and successfully route himself across
campus. Dr. Golledge and colleagues also carried out a body of subsidiary
research to discover the best methods for routing blind, visually impaired,
and mobility-impaired individuals. As noted in the introductory chapter of
this report, the major drawback of the UCSB Personal Guidance System is its
inability to incorporate transient obstacles that hinder navigation. The
purpose of the GMU Geocrowdsourcing Testbed is to provide this obstacle
information through crowdsourcing.

Nuernberger (2008) developed a system for real-time communication with
mobility-impaired individuals to enhance their ability to choose routes and
avoid obstacles. Barbeau (2010) developed a notification system for
communicating routing information to disabled individuals riding public
transit. Kasemsuppakorn and Karimi (2009) implement a wheelchair routing
method using multiple parameters such as slope, sidewalk conditions,
traffic loads, and other personal preferences. Their method uses impedance
scores for individual sidewalk segments to determine an optimal route.99 In
a later paper (2013) they emphasize the importance of developing accessible
routing applications with a true pedestrian network rather than a roadway
network, and offer advice on analytical and participatory mapping
approaches to accessibility.100 Beale et al. (2006) implement


98  Loomis  et  al.,  “Personal  Guidance  System  for  People  with  Visual  Impairment:  A  Comparison  of  Spatial 
Displays  for  Route  Guidance.” 

99  Piyawan  Kasemsuppakorn  and  Hassan  A.  Karimi,  “Personalised  Routing  for  Wheelchair  Navigation,” 
Journal  of  Location  Based  Services  3,  no.  1  (2009):  24-54. 

100  Hassan  A.  Karimi  and  Piyawan  Kasemsuppakorn,  “Pedestrian  Network  Map  Generation  Approaches 
and  Recommendation,”  International  Journal  of  Geographical  Information  Science  27,  no.  5  (2013): 
947-62. 
a network model in GIS for accessibility mapping that takes into account
slope, surface type, and the presence of curb cuts.101
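
To make the idea of segment impedance concrete, the following Python sketch combines several segment attributes into a single score and compares two candidate routes by total impedance. The attribute names, weights, scaling, and sample values are illustrative assumptions; they are not taken from Kasemsuppakorn and Karimi or from Beale et al.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Segment:
    length_m: float        # segment length in meters
    slope_pct: float       # absolute slope in percent
    surface_score: float   # 0 (smooth) to 1 (very rough), hypothetical scale
    has_curb_cut: bool


# Hypothetical user preference weights; larger means "avoid more strongly".
WEIGHTS: Dict[str, float] = {"slope": 0.5, "surface": 0.3, "curb": 0.2}


def impedance(seg: Segment, weights: Dict[str, float] = WEIGHTS) -> float:
    """Length scaled by a penalty for slope, roughness, and a missing curb cut."""
    penalty = (weights["slope"] * min(seg.slope_pct / 10.0, 1.0)
               + weights["surface"] * seg.surface_score
               + weights["curb"] * (0.0 if seg.has_curb_cut else 1.0))
    return seg.length_m * (1.0 + penalty)


def route_impedance(route: List[Segment]) -> float:
    """Total impedance of a route, summed over its segments."""
    return sum(impedance(s) for s in route)


short_but_steep = [Segment(100, 10, 0.5, False)]
long_but_easy = [Segment(150, 0, 0.0, True)]
best = min((short_but_steep, long_but_easy), key=route_impedance)
print(route_impedance(short_but_steep), route_impedance(long_but_easy))
```

With these assumed weights the longer but flat, smooth route has the lower total impedance, which is the kind of trade-off a personalized score is meant to expose.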

Several  good  examples  of  accessible  routing  applications  can  be  seen  in  the 
pgRouting  Gallery102,  where  examples  of  open  source  routing  are  linked. 
Two notable examples from that page are the Portuguese Accessible Paths
in Pinhel portal (Figure 31)103, and the campus map of the Federal
Polytechnical School of Lausanne (EPFL, Figure 32)104. The Accessible Paths in Pinhel
mapping  portal  allows  for  the  selection  and  display  of  paths  suitable  for 
wheelchair  users,  and  alternatively,  paths  suitable  for  seniors  and  those 
with  minor  mobility  impairments,  taking  into  account  both  slope  and  the 
pathway  material.  The  EPFL  campus  map  allows  for  accessible  routing  be¬ 
tween  and  through  campus  buildings,  avoiding  stairways  and  steep  paths. 
Both  applications  (Figure  31  and  Figure  32)  use  obstacle  avoidance  and 
criteria  for  determining  what  constitutes  an  accessible  route;  in  the  case  of 
Accessible  Paths  in  Pinhel,  these  criteria  are  accompanied  by  explanatory 
text  and  pictures.  Both  applications  may  fail  to  find  an  accessible  route, 
particularly  the  EPFL  campus  map,  which  employs  a  more  sophisticated 
routing  approach  using  interior  passageways.  A  notable  strength  of  the  Ac¬ 
cessible  Paths  in  Pinhel  portal  is  detailed  step-by-step  directions  with 
pathway  widths  and  obstacle  notifications.  A  notable  strength  of  the  EPFL 
campus  map  is  its  extension  of  the  routing  network  to  interior  spaces  and 
the  graphical  floor  indicator  (Figure  32)  showing  end-users  where  they 
will  be  required  to  change  floors. 


101  Linda  Beale  et  al.,  “Mapping  for  Wheelchair  Users:  Route  Navigation  in  Urban  Spaces,"  The  Carto¬ 
graphic  Journal  43,  no.  1  (2006):  68-81. 

102  http://pgrouting.org/gallery.html 

103  See  http://percursos.pinhel.proasolutions.pt/  [accessed  September  24,  2014] 

104  http://plan.epfl.ch/  [accessed  September  24,  2014] 
Figure 31. Wheelchair accessible route generated by Accessible Paths in
Pinhel Mapping Portal


Figure  32.  Federal  Polytechnical  School  of  Lausanne  campus  map  with 
accessible  routing 

The GMU Geocrowdsourcing Testbed adds to this existing body of research by
demonstrating how transient obstacle information can be collected through
crowdsourcing and displayed on a map to inform individuals. The next
logical step in this process is the development of a routing capability in
the GMU Geocrowdsourcing Testbed, to help end-users avoid the transient
obstacles collected through the system. Our work in this effort is based on
the best ideas from the accessible mapping resources reviewed here and the
research findings of authors such as Beale et al. and Kasemsuppakorn and
Karimi. The following sections of this chapter address the development of
routing capabilities and some of our preliminary findings.

Routing 

Routing  Data 

No entity on campus or in the region has access to, or has created, a
high-quality map of pedestrian infrastructure. OpenStreetMap, predictably,
comes closest, with openly licensed datasets for walking paths on campus,
but the connections of these paths with neighboring jurisdictions are
missing. Typical routing applications, even those asserted to be for
pedestrians, utilize street networks (Figure 33).


Figure  33.  Google  Maps  "pedestrian"  routing  uses  GMU  campus  roadways 

Project  researcher  Eric  W.  Ong  studied  the  pedestrian  routing  capabilities 
within  the  mapping  products  of  Google,  Microsoft,  and  Yahoo,  and 
concluded  that  none  of  them  utilize  pedestrian  networks  over  the  entire 
study  area.  To  fill  the  clear  need,  project  staff  created  a  comprehensive 
pedestrian  network  for  the  region,  and  created  datasets  of  the  related 
pedestrian  features,  such  as  sidewalk  centerlines,  crosswalks,  stairways, 
steep  paths,  bridges,  informal  pathways,  and  curb  cuts  (Figure  34).  Editing 
and refinement of this network is ongoing and updates are made on a
weekly  basis,  with  a  focus  on  topological  consistency  and  the  extension  of 
the  routing  network  to  interior  spaces  of  large  buildings  that  are 
commonly  used  during  navigation  across  campus.  The  research  staff  of 
this  project,  GMU  Parking  and  Transportation  Services,  and  GMU 
Facilities  staff  will  jointly  maintain  the  network  and  associated  data.  The 
routing  data  is  part  of  a  larger  network  analysis  and  map  service  deployed 
on  ArcGIS  Server  v.10.1,  Windows  Server,  and  the  Esri  API  for  JavaScript 
with  HTML,  CSS,  and  JavaScript  customization.  This  service  is  under 
development  and  changes  frequently.  It  can  be  found  at: 

http://geo.gmu.edu/route. 


Figure  34.  Pedestrian  Network  (yellow)  superimposed  on  image  of  region 
Exploring Routing Dynamics with the GMU Geocrowdsourcing Testbed

Pedestrian behavior on campus is poorly understood, especially the
behavior of students, faculty, staff, and visitors who are disabled. Only
one GMU staff member (interviewed extensively for this report) is trained
and certified as an orientation and mobility specialist for disabled
individuals. Having dealt with similar problems in a similar setting nearly
twenty years ago, project collaborator James Marston published work with
Dr. Richard Church in 2003, in which they assert that traditional measures
of accessibility are flawed and do not take into account the vast physical
and mobility differences among individuals.105 They propose a sophisticated
system of measuring access as a way of accommodating these differences, as
well as the many structural barriers that affect travel time and effort. A
map from their study (Figure 35) shows that the routes used by disabled
travelers vary greatly and have very different distance and difficulty
characteristics.


Figure  35.  Routing  and  Accessibility  Study,  Church  and  Marston  (2003) 


105  Richard  L.  Church  and  James  R.  Marston,  "Measuring  Accessibility  for  People  with  a  Disability,’’  Geo- 
graphical  Analysis  35,  no.  1  (2003):  83-96. 
To understand how disabled pedestrians travel in our area, we developed the
routing application discussed above and worked through routing scenarios
with several disabled travelers. Figure 36 shows the first routing
scenario, discussed with one of our system end-users who uses a wheelchair
to navigate on the GMU campus and in downtown Fairfax City near his
workplace. The scenario involved routing from the west side of Fairfax City
Hall to the Starbucks on the north side of the GMU campus. The scenario
utilizes our routing application, with the origin and destination shown
with black crosses and the normal route shown as a dashed red line. The
imposition of an obstacle from our system (shown with a red cross)
significantly lengthens the path required to have an accessible route,
which is shown with a solid green line. The end-user verified that the
routing scenario shown in green is realistic, but noted that it does not
consider the slope of the route, which is more significant with the imposed
detour.
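
The effect of imposing an obstacle on a route can be illustrated with a small shortest-path example. The testbed's routing runs as an ArcGIS Server network analysis service; the sketch below is not that implementation. It is a generic Dijkstra search in Python over a hypothetical four-node network, in which a reported obstacle is modeled by excluding the blocked segment.

```python
import heapq
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

Graph = Dict[str, Dict[str, float]]  # node -> {neighbor: length in meters}


def shortest_path(graph: Graph, start: str, goal: str,
                  blocked: Optional[Set[FrozenSet[str]]] = None
                  ) -> Tuple[float, List[str]]:
    """Dijkstra's algorithm that skips any segment listed in `blocked`."""
    blocked = blocked or set()
    queue: List[Tuple[float, str, List[str]]] = [(0.0, start, [start])]
    visited: Set[str] = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nxt, length in graph[node].items():
            if frozenset((node, nxt)) in blocked or nxt in visited:
                continue
            heapq.heappush(queue, (cost + length, nxt, path + [nxt]))
    return float("inf"), []  # no accessible route exists


# Hypothetical toy network: A-B-D is the direct route, A-C-D is the detour.
network: Graph = {
    "A": {"B": 100, "C": 150},
    "B": {"A": 100, "D": 100},
    "C": {"A": 150, "D": 200},
    "D": {"B": 100, "C": 200},
}
normal = shortest_path(network, "A", "D")
detour = shortest_path(network, "A", "D", blocked={frozenset(("B", "D"))})
print(normal, detour)
```

Blocking the B-D segment forces the longer A-C-D route, mirroring the lengthened green route in Figure 36. Because the cost here is length only, the sketch shares the limitation the end-user pointed out: slope is not considered.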


Figure  36.  The  effect  of  imposing  an  obstacle  on  a  route. 

Figure 37 shows a second routing scenario with the same end-user, in which
we discussed his likely direction of travel from an address in downtown
Fairfax City close to his workplace to the Safeway store in Courthouse
Plaza. As can be seen in Figure 37, our routing application produced an
accessible route along a pedestrian corridor through the parking lot in
front of the store (Figure 37, in green). During discussion with the
end-user, we realized that a critical sidewalk extension within the parking
lot (Figure 38) was missing from our underlying network routing dataset.
His preferred route under this scenario would have used this sidewalk
extension and an informal path through the parking lot, shown in red
(Figure 37). This informal path is slightly shorter, but was not chosen by
our routing application due to the missing segment, highlighting the
importance of informal and unmapped routes through large navigable areas.


Figure  37.  The  effect  of  a  missing  sidewalk  section  and  ad-hoc  routing 


Figure  38.  The  missing  sidewalk  section  (pictured)  in  Courthouse  Plaza 
shopping  center,  north  of  GMU  campus. 
The end-user interviewed during this research activity requested that we
develop a method (similar to those within Google Maps, Waze, and
OpenStreetMap) for adding new pedestrian network sections. To highlight
this issue, the end-user pointed out a temporary third crosswalk not more
than 100 yards from our interview location in downtown Fairfax City,
associated with the closure of sidewalks and the reconstruction of Kitty
Pozer Park (Figure 39). Being able to quickly accept reports of this type
and modify the underlying pedestrian network, as would be needed in this
case, is an important capability that we will add to our system in the
future.


Figure  39.  New  crosswalk,  created  during  construction  of  Kitty  Pozer  Park. 

Interviews with two other end-users produced similar results. The routing
scenario shown in Figure 40 involved a trip from the vicinity of a
workplace to the GMU Commerce Building. The lowest-cost route (red dashed
line) is relatively simple and easy to visualize, while the accessible
route is shown in solid green. This route is specific to a side of the
street (in order to avoid obstacles located on the opposite side), and
directs the end-user across a crosswalk near the GMU Commerce Building.
This unusual route was discussed with the end-users and determined to be
valid but not a preferred route, due to the difficulty of crossing the
street using the crosswalk on University Drive near the GMU Commerce
Building.
Figure 40. User routing scenario - obstacle avoidance and side of street selection

While  the  routing  scenarios  and  solutions  obtained  from  our  system  were 
deemed  to  be  reasonable  and  valid  according  to  all  end-users  interviewed, 
the  individuals  raised  a  number  of  very  useful  issues  that  we  will  discuss 
here  and  consider  in  future  system  development. 

One significant issue is that end-user route choice and route preference
are highly variable and highly individual. This issue is well known among
orientation and mobility specialists and the disabled community, but is
not more broadly understood. Church and Marston (2003)106 verify this
fact in their study, as does the work of Jacobson (1998)107, Golledge
(1999)108, Williams et al. (2013)109, Avila (2014)110, and Golledge et al.


106 Ibid.

107  R.  Dan  Jacobson,  “Cognitive  Mapping  without  Sight:  Four  Preliminary  Studies  of  Spatial  Learning,” 
Journal  of  Environmental  Psychology  18,  no.  3  (1998):  289-305. 

108  Reginald  G.  Golledge,  Wayfinding  Behavior:  Cognitive  Mapping  and  Other  Spatial  Processes  (JHU 
Press,  1999). 
(2000)111, who all note the heterogeneity in preferences associated with
route selection and wayfinding behavior. This issue is highlighted
prominently by Figure 41, which shows the preferred route of our first
end-user subject, who notes the many clear reasons why the much longer
route shown in Figure 41 is his preferred route between his dormitory and
the Johnson Center on the George Mason campus. An interesting aspect of the
route shown in Figure 41, and of other scenarios reviewed with this subject
using our testbed routing application, is that the direction of his travel
made a difference in the route he selected, because the curvature and slope
of some paths made them easier to traverse in a specific direction. Some of
the preferences stated by this end-user support the findings in Church and
Marston (2003), Kasemsuppakorn and Karimi (2009, 2013), and Beale et al.
(2006).


109  Michele  A.  Williams,  Amy  Hurst,  and  Shaun  K.  Kane,  “‘Pray  before  You  Step  out’:  Describing  Personal 
and Situational Blind Navigation Behaviors,” in Proceedings of the 15th International ACM SIGACCESS
Conference  on  Computers  and  Accessibility  (Bellevue,  Washington:  ACM,  2013),  1-8. 

110 Kimberly Avila, “The Experiences of Pedestrians with Visual Impairments in a Metropolitan Setting:
An Ethnographic Inquiry,” in Proceedings, Biannual International Conference of the Association for
Education and Rehabilitation for the Blind and Visually Impaired, San Antonio, TX, 07-14).

111  Reginald  G.  Golledge  et  al.,  “Cognitive  Maps,  Spatial  Abilities,  and  Human  Wayfinding,”  Geographical 
Review  of  Japan,  Series  B  73,  no.  2  (2000):  93-104. 
Figure 41. Preferred routes (green) are often longer and more complex than a shortest route (blue)

Pingel (2010a, 2010b) notes that asymmetry in route choice is common, and
that the slope and direction of travel can make significant changes to
route selection.112,113 Route choice appears to be highly variable and
personal and would be difficult to capture in our current system, though
some common suggestions, such as the inclusion of elevation and slope,
would be important additions during the next phase of our work.
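
One way to represent this direction dependence in a routing network is to give each path a different cost in each direction. The Python sketch below is only an illustration of that idea: the cost formula and both weights are hypothetical placeholders, and they are not taken from Pingel's work or from the testbed.

```python
def directed_cost(length_m: float, rise_m: float,
                  uphill_weight: float = 20.0,
                  downhill_weight: float = 5.0) -> float:
    """Cost of traversing a segment in one direction.

    rise_m is the elevation gain in the direction of travel (negative
    when descending). The weights are hypothetical placeholders.
    """
    slope = rise_m / length_m
    if slope >= 0:
        factor = 1.0 + uphill_weight * slope
    else:
        factor = 1.0 + downhill_weight * (-slope)
    return length_m * factor


# The same 100 m path with 5 m of elevation change, in both directions.
uphill = directed_cost(100.0, 5.0)     # 100 * (1 + 20 * 0.05)
downhill = directed_cost(100.0, -5.0)  # 100 * (1 + 5 * 0.05)
print(uphill, downhill)
```

In a routing graph, this means storing each path as two directed edges with separate costs rather than as one undirected edge, so that the outbound and return trips can legitimately differ.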

Conclusions  and  Summary 

The  routing  work  profiled  in  this  chapter  represents  a  strategic  extension 
to  our  GMU  Geocrowdsourcing  Testbed,  and  is  of  high  interest  to  local 
campus  and  municipal  authorities,  who  are  struggling  to  accommodate  the 
growth  of  the  University  and  the  strain  this  growth  puts  on  the  local  trans¬ 
portation  infrastructure. 


112  Thomas  J.  Pingel,  "Modeling  Slope  as  a  Contributor  to  Route  Selection  in  Mountainous  Areas,"  Car¬ 
tography  and  Geographic  Information  Science  37,  no.  2  (2010):  137-48. 

113  Thomas  James  Pingel,  Strategic  Elements  of  Route  Choice  (University  of  California,  Santa  Barbara, 
2010). 
Interesting examples of accessible routing applications profiled earlier in
this  chapter  highlight  the  usefulness  of  incorporating  slope  into  our  rout¬ 
ing  algorithms,  as  well  as  pathway  material,  and  routing  through  the  in¬ 
side  of  buildings  to  take  advantage  of  elevators. 

The  routing  system  discussed  in  this  chapter  is  based  on  a  pedestrian  net¬ 
work  and  obstacle  data  from  our  system.  The  purpose  of  this  routing  sys¬ 
tem  is  to  provide  obstacle-avoiding  route  suggestions  to  disabled  individu¬ 
als.  Future  work  on  this  extension  of  our  testbed  will  be  shared  with  GMU 
Parking  and  Transportation  Services,  as  well  as  other  interested  parties. 
Routing  can  be  difficult,  and  has  been  described  by  a  project  consultant  as 
a  poor  demonstration  of  the  capabilities  of  our  system,  due  to  the  many 
ways that an obstacle-avoiding routing application can fail to work
properly. We acknowledge and have witnessed the large variation in routing
and wayfinding preferences among the individuals interviewed for this
project, and expect this variation to continue as we expand our user base.
The goal of this testbed extension is not to meet every one of those
preferences, but to demonstrate the usefulness of crowdsourced geospatial
data and the possible uses of the GMU Geocrowdsourcing Testbed.
5 Extension of the GMU Geocrowdsourcing
Testbed: Visualization

MacEachren describes visualization in the domain of mapping sciences and
geography as a fundamental geographic method associated with all aspects of
map use in science, with a specific focus on the exploration of unknown
phenomena in a highly interactive, private setting. The visualization cube
associated with this perspective (Figure 42)114 is well known and widely
accepted as a way of thinking about how computers, geographic information
systems, and the Internet have changed the traditional discipline of
cartography. The MacEachren Geovisualization Cube suggests that
visualization environments, such as the one being developed in the GMU
Geocrowdsourcing Testbed, are typically used for exploring and revealing
unknowns through high human-map interaction, usually in a private setting.
These characteristics match our intended use and development of the GMU
Geocrowdsourcing Testbed’s visualization capabilities.


Figure  42.  Geovisualization  Cube,  from  MacEachren  (1994) 


114  Alan  M.  MacEachren  and  David  Ruxton  Fraser  Taylor,  Visualization  in  Modern  Cartography,  vol.  2 
(Pergamon  Press,  1994),  Fig.  1.3,  p.6. 
Map Visualizations

Visualization  is  an  important  element  of  our  current  GMU  Geo¬ 
crowdsourcing  Testbed,  and  while  an  interactive  map  has  been  at  the  cen¬ 
ter  of  the  testbed  for  its  entire  history,  prior  to  2014,  we  did  not  have  the 
capability  for  interactively  visualizing  the  crowdsourced  obstacle  data  and 
associated  quality  assessment  information.  Other  than  the  recent  work  of 
researchers  in  this  project,  very  little  previous  academic  research  is  availa¬ 
ble  about  interactive  visualization  of  pedestrian  navigation  obstacles.  More 
generally,  navigation-centric  applications  such  as  Waze,  and  to  a  lesser  de¬ 
gree,  Google  Maps,  include  event  information  that  in  some  cases  indicates 
an  obstacle.  Similarly,  Travelmidwest.com’s  map  of  Chicago  (Figure  43) 
uses  an  extensive  palette  of  map  symbols  to  show  construction  zones,  inci¬ 
dents,  special  events,  and  weather-related  closures  for  vehicular  travel 
around the Chicago metropolitan area. Goldsberry’s 2008 paper115, based
on  his  dissertation  work  on  traffic  maps  in  Los  Angeles,  contains  some 
useful  cartographic  ideas  about  symbolization  on  traffic  maps,  which  are 
often  focused  on  arterial  congestion  and  obstacles. 


Figure 43. Travelmidwest.com's traffic map of Chicago


115 Kirk Goldsberry, “GeoVisualization of Automobile Congestion,” in AGILE Workshop on
GeoVisualization of Dynamics, Movement and Change, Girona, May, vol. 5 (Citeseer, 2008).
Our GMU Geocrowdsourcing Testbed map displays are based on Google Maps API
v3’s base data, and on Esri’s API for JavaScript, whose base data layer is
derived from OpenStreetMap. Application development using a generic mapping
API and basemap imposes some limitations, and represents a classic problem
for cartographers, who have gained the power of the Internet but have
inherited mediocre, fixed basemaps that are difficult to customize. As we
search for exemplars (specifically web-based mapping and visualization
systems) and discover noteworthy cartographic practices, we will adopt them
in our project.

Obstacle  Map 

Our obstacle-mapping portal can be found at http://geo.gmu.edu/vgi and
consists of a standard Google Maps API basemap with point and areal symbols
added. Figure 44 shows our current palette of point symbols for representing
reports and obstacles. The status of each report and obstacle is stored as
an attribute in our database, and a corresponding color is used for display.
Simple colors were chosen to support readability and quick interpretation.
The neutral gray used for closed events allows those reports to be visually
separated from the other map content. The larger bright red symbol for
obstacles is chosen for emphasis.
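
The status-to-color lookup described above can be sketched as follows. This is
a minimal illustration, not the testbed's actual code: the status keys, the
symbol sizes, and the yellow used for reports under review are assumptions,
while gray, orange, and red follow the colors named in this chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    color: str
    size_px: int


# Fallback for any status value not listed below (assumed, not from the report).
DEFAULT_SYMBOL = Symbol("black", 12)

# Hypothetical status keys; sizes are illustrative only.
STATUS_SYMBOLS = {
    "under_review": Symbol("yellow", 12),  # assumed color, not stated in the report
    "confirmed": Symbol("red", 12),
    "official": Symbol("orange", 12),
    "closed": Symbol("gray", 12),          # neutral color for closed events
}

# Obstacles use a larger bright red symbol for emphasis.
OBSTACLE_SYMBOL = Symbol("red", 18)


def symbol_for(kind: str, status: str) -> Symbol:
    """Return the display symbol for a report or obstacle record."""
    if kind == "obstacle":
        return OBSTACLE_SYMBOL
    return STATUS_SYMBOLS.get(status, DEFAULT_SYMBOL)
```

Keeping the mapping in one table means that a new status only requires a new
dictionary entry, not a change to the drawing logic.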


[Figure 44 legend, “Filter by status”: Reports (Under Review, Confirmed, Closed), Obstacles, and Official Reports]

Figure 44. Our palette of symbols for obstacles

One difficult aspect of mapping obstacles in a geocrowdsourcing environment
is overlap and spacing. Figure 45 shows an area of downtown Fairfax
City where the spacing and density of reports and obstacles yields good
results. Several closed reports (in gray) and confirmed reports (in red) are
visible along with obstacles (in red). The orange official reports from the
City of Fairfax Public Works department, indicating the sidewalk closures
associated with the construction of the Kitty Pozer Garden, are visible
along with the polygon-based footprint for the authoritative or official
report.

[Figure 45 content: map of downtown Fairfax with an information pop-up for an official report. Report ID: report.000336(mod). Report date: 9/5/2014 1:15. Obstacle type: sidewalk obstruction. Duration: Long (> 7 days). Urgency: High. Location comment: Corner of North St. and East St. Obstacle comment: Construction to build Kitty-Pozer Garden includes a barrier along the adjacent sidewalk. Sidewalk is unusable. Status: official reports.]

Figure 45. Obstacle map in the downtown area of the City of Fairfax

Figure 46, in contrast, demonstrates the significant problem with spacing
and density for areas with many reports, such as this walkway between
Robinson A and the Fenwick Library on the GMU campus, which is under
construction. While the orange official report can be seen clearly, many of
the other confirmed reports and obstacles overlap, and any information
pop-ups (such as in Figure 45) make this cartographic problem worse. Future
work in visualization for the GMU Geocrowdsourcing Testbed will include
solutions for automating the spacing and density of contributed reports
in order to improve obstacle maps and visual display. Dias (2013)116
and Dias et al. (2014)117 address some of these issues in their
geovisualization work and mashup tools, which use grid-based clustering to
simplify the visual display of dense point data. Approaches adapted from
Dias et al. and other best practices will be used to improve the visual display.
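
To illustrate the general idea of grid-based clustering, the sketch below bins
points into square cells and replaces each occupied cell with one centroid and
a count. It is a generic version of the technique, not the method of Dias et
al.; the function name, the use of projected coordinates (for example, screen
pixels), and the cell size are assumptions.

```python
import math
from collections import defaultdict


def grid_cluster(points, cell_size):
    """Group (x, y) points into square grid cells.

    Returns one dict per occupied cell, holding the centroid of the
    cell's points and the number of points it represents.
    """
    cells = defaultdict(list)
    for x, y in points:
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells[key].append((x, y))

    clusters = []
    for key in sorted(cells):  # sorted only to make the output order stable
        members = cells[key]
        cx = sum(p[0] for p in members) / len(members)
        cy = sum(p[1] for p in members) / len(members)
        clusters.append({"x": cx, "y": cy, "count": len(members)})
    return clusters


# Three reports, two of which fall in the same 10-unit cell.
clusters = grid_cluster([(1, 1), (2, 2), (11, 1)], cell_size=10)
```

A map display could then draw one symbol per cluster, scaled or labeled by
`count`, and recompute the clusters with a smaller cell size as the user zooms in.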


Figure 46. Dense collection of reports and obstacles near the Fenwick Library

Route Map

The route mapping portal under development for this project (Figure 47)
uses Esri’s API for JavaScript and its standard OpenStreetMap base layer,
which has a better representation of campus buildings and features
than any of the other basemaps available through Esri’s API. This base data
layer uses standard OSM symbolization and has generic mapping controls taken
directly from the Esri API. Because we have significant local geospatial data
and the Esri API can facilitate the creation of a custom base layer, we will
change the cartographic aspects of this standard routing map in the near
future. This interface allows us to select origins and destinations with a
mouse click or screen tap, create custom routes using stops and barriers,
and display the current obstacle data from our system as red crosses,
which routes can avoid. Future work on this routing map will include the
creation of a base data layer from our own collections of data, as well as
better point symbolization for origins, destinations, and obstacles.

116 Shawn Bosco Dias, “Geovisualisation Mashup Tool to Provide Better Situation Awareness for
Earthquakes” (Master of Science thesis, George Mason University, 2013).

117 Shawn B. Dias et al., “Mashing Up Geographic Information for Responding to Emergencies - An
Example with Earthquake,” Journal of Geographic Information System, 2014.
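
The idea of routing around reported obstacles can be sketched with a standard
shortest-path search that skips blocked segments. This is a generic Dijkstra
implementation offered for illustration, not the Esri routing service used by
the portal; the graph layout, node names, and the `blocked` parameter are
hypothetical.

```python
import heapq


def shortest_path(graph, origin, destination, blocked=frozenset()):
    """Dijkstra search over graph = {node: {neighbor: length}}.

    blocked is a set of frozenset({a, b}) segments to avoid, standing in
    for sidewalk segments closed by an obstacle. Returns (path, length),
    or None when no unblocked route exists.
    """
    best = {origin: 0.0}
    previous = {}
    queue = [(0.0, origin)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == destination:
            break
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, length in graph.get(node, {}).items():
            if frozenset((node, neighbor)) in blocked:
                continue  # segment closed by an obstacle
            candidate = dist + length
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                previous[neighbor] = node
                heapq.heappush(queue, (candidate, neighbor))

    if destination not in best:
        return None
    path = [destination]
    while path[-1] != origin:
        path.append(previous[path[-1]])
    return list(reversed(path)), best[destination]


# Hypothetical four-node sidewalk network with segment lengths.
network = {
    "A": {"B": 1.0, "D": 2.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"B": 1.0, "D": 2.0},
    "D": {"A": 2.0, "C": 2.0},
}
direct = shortest_path(network, "A", "C")
detour = shortest_path(network, "A", "C", blocked={frozenset(("B", "C"))})
```

Treating an obstacle as a removed segment is the simplest option. A softer
variant would add a penalty to the segment length so that a blocked route
remains available when no alternative exists.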


Figure  47.  Routing  map,  using  Esri's  API  for  JavaScript  and  OpenStreetMap 
base  data 

Bicycle  Map 

These routing tools follow the efforts of project researcher Jessica Fayne
to build a bike map and related routing applications for GMU Parking and
Transportation Services. Fayne’s popular bike map (Figure 48), printed on
microfiber cloth, is in high demand. Fayne’s work on this project was
advised by project personnel and funded by the Fairfax County Department of
Transportation and GMU Parking and Transportation Services. It
will continue during the upcoming year, funded partially by this research
effort. This joint work reflects the strong interest of GMU Parking and
Transportation Services in working with our research group to address
non-vehicular transportation options and the mapping infrastructure
to support these options.




Figure  48.  Jessica  Fayne's  microfiber  bike  map  design 

Fayne extended the bike mapping effort (Figure 48) with the creation of a
GMU Parking and Transportation Services Mapping Portal (Figure 49),
which supports information about bike sharing, car sharing, and
infrastructure used by both. Our routing service will add to these resources,
and will be developed jointly through Fayne with GMU Parking and
Transportation Services.

[Figure 49 content: “Mason Transportation Map” web portal, described on the page as: “This map shows the location of bike features like racks and pumps, Zipcar, and Mason Shuttles stops at Mason”]

Figure 49. Jessica Fayne's Mason Transportation Mapping Portal


Moderator  Dashboard 

Our moderator dashboard is being designed and developed to connect
our data quality assessment work (described in Chapter 2 of this report)
with the map-based visualization capabilities of GIS
and the statistical graphics capabilities available through jQuery, a
versatile, small, and fast JavaScript library for client-side scripting.
Our current design (Figure 50) divides the display into two sections,
with 60% of the horizontal screen dimension allocated to the map
display and the remainder allocated to selection tools (sliders for filtering
by report or obstacle parameters) and statistical graphics. The moderator
dashboard’s visualization tools are built using JavaScript, AJAX,
HTML, and CSS, on top of the same PostgreSQL database (v9.2) used for
our data collection tools.

We  developed  this  tool  to  identify  and  visualize  aspects  of  data  quality  that 
have  unique  temporal  or  spatial  properties. 
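
The kind of filtering that the dashboard sliders perform can be sketched as
follows. This is a simplified stand-in written in Python, not the dashboard's
JavaScript code; the record fields (`submitted`, `source`) and the function
name are assumptions.

```python
from datetime import datetime


def filter_reports(reports, start_hour, end_hour, source=None):
    """Keep reports submitted in [start_hour, end_hour), optionally by source."""
    selected = []
    for report in reports:
        if not (start_hour <= report["submitted"].hour < end_hour):
            continue
        if source is not None and report["source"] != source:
            continue
        selected.append(report)
    return selected


# Hypothetical sample records.
sample_reports = [
    {"id": 1, "submitted": datetime(2014, 9, 5, 13, 15), "source": "official"},
    {"id": 2, "submitted": datetime(2014, 9, 5, 8, 0), "source": "crowdsourced"},
    {"id": 3, "submitted": datetime(2014, 9, 6, 15, 0), "source": "crowdsourced"},
]
afternoon = filter_reports(sample_reports, 11, 16)
afternoon_crowd = filter_reports(sample_reports, 11, 16, source="crowdsourced")
```

The filtered list would then feed both the map display and the statistical
graphics, so that the two views always describe the same subset of reports.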

[Figure 50 content: “Crowdsourced Geospatial Data - Geovisualization” dashboard with a time-of-day range selector and element selectors (select 1-2 elements to generate the charts): Recent 6 Months, Recent 8 Weeks, Recent 5 Days, Quality Scores, Source: Campus/City/County Official, Source: Crowdsourced]

Figure 50. Visualization tool (under development),
http://geo.gmu.edu/viz.html

Similar data visualization tools include Tableau Public, a free web-based
interactive charting and graphing application. Novel exemplars from
Tableau Public include graphics such as Figure 51, which shows user IDs for
contributors to our system, and displays the quality of their report
contributions through time. In the upcoming weeks we will be implementing a
variety of map-based and chart-based data visualization tools to explore
the spatial and temporal dimensions of data quality.
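
A chart of contribution quality through time, grouped by contributor, needs
the reports summarized by user and time period first. The sketch below shows
one way to do that; the field names and the 0-to-1 quality score are
assumptions, not the testbed's schema.

```python
from collections import defaultdict
from datetime import datetime


def quality_by_user_month(reports):
    """Return {(user_id, 'YYYY-MM'): mean quality score} for the given reports."""
    totals = defaultdict(lambda: [0.0, 0])  # running [sum, count] per key
    for report in reports:
        key = (report["user_id"], report["submitted"].strftime("%Y-%m"))
        totals[key][0] += report["quality"]
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


# Hypothetical sample records.
quality_sample = [
    {"user_id": "u1", "submitted": datetime(2014, 9, 5), "quality": 0.5},
    {"user_id": "u1", "submitted": datetime(2014, 9, 20), "quality": 1.0},
    {"user_id": "u1", "submitted": datetime(2014, 10, 2), "quality": 1.0},
    {"user_id": "u2", "submitted": datetime(2014, 9, 7), "quality": 0.25},
]
summary = quality_by_user_month(quality_sample)
```

Each key in `summary` corresponds to one point on a per-contributor line in a
chart such as the one described above.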


Figure 51. Tableau Public: Change in QA Quality by ID

Conclusions and Summary

The Moderator Dashboard is an important extension of our GMU
Geocrowdsourcing Testbed. Visualization represents the emergence of an
interactive, scientific approach to the traditional discipline of cartography,
and has become an important part of a web-based system for analysis. Our
visualization tools form the critical connection between our quality
assessment activities and our mapping activities. The ability to create
map-based and chart-based displays of report and obstacle data will help
discover unknowns, as envisioned by the MacEachren Cube, and will lead us toward
improvements in our work. Where possible, we will find and emulate the
best examples of data visualization.

6 Conclusions and Future Directions

In earlier phases of this multi-phase research effort, we conducted a
significant review and characterization of the state-of-the-art in crowdsourcing
geospatial data (Rice et al. 2012a). This work has led us toward good ideas
and best practices that have informed our design and implementation of a
system for crowdsourced data collection, described in Rice et al. 2013. This
report describes our effort to build a system for quality assessment, based
on the best practices and associated science, a program for recruiting and
training participants, and extensions of our testbed in the areas of
accessible routing and visualization.

Methods for geospatial data quality assessment have developed over the
past seven decades, and have evolved along with GIS. The current methods,
built on the concepts of the National Standard for Spatial Data Accuracy
(NSSDA) and other best practices, include considerations for positional
accuracy, attribute accuracy, completeness, logical consistency, semantic
accuracy, temporal accuracy, lineage, and usage. Girres and Touya (2010)
and Haklay (2010) provide useful applications of traditional data quality
concepts to geocrowdsourced data, primarily OpenStreetMap. Goodchild
and Li (2012) outline three general methods for quality assurance in
crowdsourced geospatial data, one of which (the social approach) matches
our approach for quality assurance. The use of trained student moderators
to provide a comprehensive quality assessment, based on best practices,
allows us to assess the quality of the positioning and attributes of reports.
These moderator-led data quality actions, outlined in Chapter 2 of this
report, result in general quality measures that we use to analyze the
crowdsourced geospatial data and explore geocrowdsourcing dynamics in
our system.

Project researcher Fabiana Paez conducted an extensive review of training
approaches for crowdsourced geospatial data, concluding that an embedded,
modular approach would benefit our project, similar to the approaches
used by Google Map Maker and Waze. Through her work, we have
trained more than 200 potential contributors, some of whom have
contributed to our system. Based on the insights and conclusions in her
master’s thesis, we will revise our training program to integrate it with our
data contribution tools, which matches some of the practices advocated by
researchers interested in the effectiveness of computer training programs.

Extensions of our GMU Geocrowdsourcing Testbed include accessibility
routing (discussed in Chapter 4) and visualization (discussed in Chapter
5). The accessibility routing work, which is of high interest to local
collaborators, has proven to be interesting and challenging, due to the highly
variable preferences and individual decisions associated with route choice. A
few of these significant preferences, such as slope and curvature, will be
implemented within our GMU Geocrowdsourcing Testbed routing extension.
The visualization extension to our GMU Geocrowdsourcing Testbed
is intended to connect our data quality metrics to our mapping system.
Innovative statistical graphics are being developed to help illuminate
and explore spatial and temporal relationships in our data.

Final items for future work include the following topics that have emerged
during our work this year. We plan on conducting an analysis of the
influence that the map base layer (and its level of detail) has on the quality and
completeness of information contributed to our testbed. Additionally, we
plan on conducting a series of checks for moderator consistency in defining
the “ground truth” for position and attributes of reports, which is a
foundational element of our quality assessment. We will also look closely
at the influence of the computer input device (mobile, tablet, desktop) on
the precision and accuracy of report location, and plan on extending this
idea to include other positional indicators, such as embedded image
geotags and text-based location description. These future research areas will
help us understand the dynamics of geocrowdsourcing and will help us
improve the GMU Geocrowdsourcing Testbed.
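
One way the planned comparison of input devices could be approached is to
measure the distance between each reported location and the moderator-defined
ground truth, then average that error per device type. The sketch below is an
illustration under assumed field names (`device`, `lat`, `lon`, `truth_lat`,
`truth_lon`), using the haversine great-circle distance on a spherical Earth.

```python
import math
from collections import defaultdict
from statistics import mean

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def mean_error_by_device(reports):
    """Return {device: mean positional error in meters}."""
    errors = defaultdict(list)
    for report in reports:
        errors[report["device"]].append(
            haversine_m(report["lat"], report["lon"],
                        report["truth_lat"], report["truth_lon"])
        )
    return {device: mean(values) for device, values in errors.items()}
```

The same per-group summary could be applied to other positional indicators,
such as image geotags, by substituting their coordinates for the reported
location.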
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